Sixth ISCA Workshop on Speech Synthesis

Bonn, Germany
August 22-24, 2007

An Evaluation of Many-to-One Voice Conversion Algorithms with Pre-Stored Speaker Data Sets

Daisuke Tani, Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano

Graduate School of Information Science, Nara Institute of Science and Technology, Japan

This paper describes an evaluation of many-to-one voice conversion (VC) algorithms, which convert an arbitrary speaker's voice into a particular target speaker's voice. These algorithms effectively generate a conversion model for a new source speaker using multiple parallel data sets of many pre-stored source speakers and the single target speaker. We conducted experimental evaluations to demonstrate the conversion performance of each many-to-one VC algorithm, including not only the conventional algorithms based on a speaker-independent GMM and on eigenvoice conversion (EVC), but also new algorithms based on speaker selection and on EVC with speaker adaptive training (SAT). The results show that adapting the conversion model significantly improves conversion performance, and that the algorithm based on speaker selection works well even with a very limited amount of adaptation data.


Bibliographic reference.  Tani, Daisuke / Ohtani, Yamato / Toda, Tomoki / Saruwatari, Hiroshi / Shikano, Kiyohiro (2007): "An evaluation of many-to-one voice conversion algorithms with pre-stored speaker data sets", In SSW6-2007, 107-112.