INTERSPEECH 2006 - ICSLP
This study presents an approach to GMM-based speech conversion using maximum a posteriori (MAP) adaptation. First, a conversion function is trained on a parallel corpus containing the same utterances spoken by both the source speaker and a reference speaker. Then a non-parallel corpus from a new target speaker is used to adapt the conversion function so that it models the voice conversion between the source speaker and the new target speaker. The consistency among the adaptation data is estimated in order to select suitable data from the non-parallel corpus for MAP-based adaptation of the GMMs. In the speech conversion evaluation, experimental results show that MAP adaptation using a small non-parallel corpus can reduce the conversion error and improve speech quality in terms of speaker identification, compared with the method without adaptation. Objective and subjective tests also confirm the promising performance of the proposed approach.
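To illustrate the general idea of MAP adaptation of a GMM, the following is a minimal sketch and not the authors' implementation. It assumes the widely used relevance-factor form of MAP mean adaptation, scalar (1-D) features, and fixed mixture weights and variances. The function names and the `relevance` parameter are hypothetical. The abstract does not specify the conversion function or the data selection criterion, so neither is modelled here.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_adapt_means(data, weights, means, variances, relevance=16.0):
    """Return MAP-adapted means of a 1-D GMM.

    Assumption: only the means are adapted; weights and variances stay fixed.
    `relevance` (hypothetical name) controls how strongly the prior means
    resist the adaptation data.
    """
    n_comp = len(means)
    occupancy = [0.0] * n_comp    # soft counts n_k
    first_order = [0.0] * n_comp  # sums of gamma_k(x) * x

    for x in data:
        # Posterior probability (responsibility) of each component for x.
        likes = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # sample is too far from every component to count
        for k in range(n_comp):
            gamma = likes[k] / total
            occupancy[k] += gamma
            first_order[k] += gamma * x

    # Interpolate between the data mean and the prior mean:
    # new_mean = (sum_k + r * old_mean) / (n_k + r)
    return [
        (first_order[k] + relevance * means[k]) / (occupancy[k] + relevance)
        for k in range(n_comp)
    ]


if __name__ == "__main__":
    prior_means = [0.0, 10.0]
    adapted = map_adapt_means(
        data=[1.0] * 20,
        weights=[0.5, 0.5],
        means=prior_means,
        variances=[1.0, 1.0],
    )
    print(adapted)
```

In this sketch, components that receive many adaptation samples move toward the new speaker's data, while components with little or no data stay close to their prior means. This is why MAP adaptation can work with a small adaptation corpus.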
Bibliographic reference. Lee, Chung-Han / Wu, Chung-Hsien (2006): "MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training", In INTERSPEECH-2006, paper 1164-Thu1BuP.2.