INTERSPEECH 2006 - ICSLP
This paper describes a novel framework for voice conversion (VC), which we call eigenvoice conversion (EVC). We apply EVC to the conversion from a source speaker's voice to arbitrary target speakers' voices. A canonical eigenvoice GMM (EV-GMM) is trained in advance using multiple parallel data sets consisting of utterance pairs of the source speaker and multiple pre-stored target speakers. This conversion model enables us to flexibly control the speaker individuality of the converted speech by manually setting weight parameters. In addition, the optimum weight set for a specific target speaker can be estimated using only speech data of that target speaker, without any linguistic restrictions. We evaluate the performance of EVC with a spectral distortion measure. Experimental results demonstrate that EVC works very well even if only a few utterances of the target speaker are used for the weight estimation.
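To illustrate how a weight set could control speaker individuality, here is a minimal sketch. The abstract does not give the model's parameterization, so this assumes the usual eigenvoice form, in which the target mean of each mixture component is a bias vector plus a weighted sum of eigenvectors. The function name `ev_gmm_target_mean` and all values are hypothetical and chosen only for illustration.

```python
from typing import List


def ev_gmm_target_mean(
    bias: List[float],
    eigenvectors: List[List[float]],
    weights: List[float],
) -> List[float]:
    """Assumed eigenvoice form for one mixture component:
    mean = bias + sum_j weights[j] * eigenvectors[j]."""
    if len(eigenvectors) != len(weights):
        raise ValueError("need exactly one weight per eigenvector")
    mean = list(bias)  # copy so the caller's bias vector is not modified
    for w, vec in zip(weights, eigenvectors):
        if len(vec) != len(bias):
            raise ValueError("eigenvector and bias dimensions differ")
        for d, v in enumerate(vec):
            mean[d] += w * v
    return mean


# Toy 2-dimensional example with two eigenvectors (illustrative values only).
bias = [1.0, 2.0]
eigenvectors = [[1.0, 0.0], [0.0, 2.0]]

# All-zero weights leave the mean at the bias vector.
zero = ev_gmm_target_mean(bias, eigenvectors, [0.0, 0.0])

# Changing the weights moves the target mean within the eigenvoice subspace.
shifted = ev_gmm_target_mean(bias, eigenvectors, [0.5, -1.0])
```

Under this assumption, setting the weights by hand moves every component mean within the same low-dimensional subspace, and estimating them from a target speaker's data selects one point in that subspace.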
Bibliographic reference. Toda, Tomoki / Ohtani, Yamato / Shikano, Kiyohiro (2006): "Eigenvoice conversion based on Gaussian mixture model", In INTERSPEECH-2006, paper 1717-Thu2A3O.5.