Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

MAP-Based Adaptation for Speech Conversion Using Adaptation Data Selection and Non-Parallel Training

Chung-Han Lee, Chung-Hsien Wu

National Cheng Kung University, Taiwan

This study presents an approach to GMM-based speech conversion using maximum a posteriori (MAP) adaptation. First, a conversion function is trained on a parallel corpus containing the same utterances spoken by both the source and the reference speakers. Then a non-parallel corpus from a new target speaker is used to adapt the conversion function, so that it models the voice conversion between the source speaker and the new target speaker. The consistency among the adaptation data is estimated in order to select suitable data from the non-parallel corpus for MAP-based adaptation of the GMMs. In the speech conversion evaluation, experimental results show that MAP adaptation using a small non-parallel corpus can reduce the conversion error and improve the speech quality for speaker identification compared to the method without adaptation. Objective and subjective tests also confirm the promising performance of the proposed approach.
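The abstract does not give the update equations, so the following is only a minimal sketch of what MAP adaptation of GMM means can look like. It assumes the widely used relevance-factor form of the update, diagonal covariances, and mean-only adaptation; the function name `map_adapt_means` and the `relevance` parameter are hypothetical and are not taken from the paper. The data-selection step based on consistency is not modelled here.

```python
import numpy as np

def map_adapt_means(means, covars, weights, data, relevance=16.0):
    """Sketch: MAP-adapt the means of a diagonal-covariance GMM.

    means, covars: (K, D) arrays; weights: (K,); data: (T, D) adaptation frames.
    relevance is an assumed prior-strength constant, not a value from the paper.
    """
    # Log-density of every frame under every diagonal Gaussian component.
    diff = data[:, None, :] - means[None, :, :]                  # (T, K, D)
    log_prob = -0.5 * (np.sum(diff ** 2 / covars[None], axis=2)
                       + np.sum(np.log(2 * np.pi * covars), axis=1)[None])

    # Component posteriors per frame, computed stably in the log domain.
    log_post = np.log(weights)[None] + log_prob                  # (T, K)
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order sufficient statistics of the adaptation data.
    n_k = post.sum(axis=0)                                       # (K,)
    data_mean = post.T @ data / np.maximum(n_k, 1e-10)[:, None]  # (K, D)

    # Interpolate between the data mean and the prior mean; components that
    # see little adaptation data stay close to the prior.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * data_mean + (1.0 - alpha) * means
```

With a single component whose prior mean is 0, sixteen adaptation frames all equal to 1, and `relevance=16.0`, the interpolation weight is 16 / (16 + 16) = 0.5, so the adapted mean lands halfway between the prior and the data. This illustrates why a small adaptation corpus shifts the model only partially.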

Bibliographic reference. Lee, Chung-Han / Wu, Chung-Hsien (2006): "MAP-based adaptation for speech conversion using adaptation data selection and non-parallel training", In INTERSPEECH-2006, paper 1164-Thu1BuP.2.