The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis

Kyoto, Japan
September 22-24, 2010

GMM-PCA based Speaker-Timbre Conversion on Full-Quality Speech

Fernando Villavicencio, Esteban Maestre

Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain

This work studies the GMM-based approach to full-quality speaker-timbre conversion. In general, high-quality voice conversion requires accurate spectral envelope estimates, which result in high-dimensional feature vectors and relatively high computational cost. To achieve low-dimensional processing, accurate envelope estimates of the speakers are mel-frequency scaled and projected onto the space defined by a subset of the principal components. The GMM-based feature conversion is then performed in the reduced space. Our experimental findings confirm that this strategy is beneficial, most notably in the quality of the converted speech, and that it significantly reduces computational cost.
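The pipeline summarized above (PCA reduction of envelope features, then GMM-based conversion in the reduced space) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`fit_pca`, `project`, `reconstruct`, `gmm_convert`), the 40-dimensional synthetic features, and the choice of 8 components are all hypothetical. The mel-frequency scaling step is omitted, and the conversion function assumes the common joint-density GMM regression form with already-trained parameters, which the abstract does not specify.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return (mean, basis); basis columns are the leading principal directions."""
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of Vt are principal directions,
    # ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def project(X, mean, basis):
    """Map full-dimensional features into the reduced PCA space."""
    return (X - mean) @ basis

def reconstruct(Z, mean, basis):
    """Map reduced features back to the full-dimensional space."""
    return Z @ basis.T + mean

def gmm_convert(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Convert one reduced source vector x with joint-density GMM regression.

    Assumed form: y = sum_k p(k|x) * (mu_y[k] + cov_yx[k] cov_xx[k]^-1 (x - mu_x[k])).
    """
    K = len(weights)
    log_post = np.empty(K)
    for k in range(K):
        d = x - mu_x[k]
        _, logdet = np.linalg.slogdet(cov_xx[k])
        # Log of weight times Gaussian density; the (2*pi) constant is the
        # same for every component and cancels in the normalization below.
        log_post[k] = np.log(weights[k]) - 0.5 * (logdet + d @ np.linalg.solve(cov_xx[k], d))
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    y = np.zeros_like(mu_y[0], dtype=float)
    for k in range(K):
        y += post[k] * (mu_y[k] + cov_yx[k] @ np.linalg.solve(cov_xx[k], x - mu_x[k]))
    return y

# Synthetic stand-in for envelope features (200 frames, 40 coefficients).
rng = np.random.default_rng(0)
envelopes = rng.normal(size=(200, 40))
mean, basis = fit_pca(envelopes, n_components=8)
reduced = project(envelopes, mean, basis)       # conversion would operate on these
approx = reconstruct(reduced, mean, basis)      # back to the full envelope space
```

In this sketch the GMM operates on 8-dimensional vectors rather than 40-dimensional ones, so each covariance matrix shrinks from 40×40 to 8×8. That is the kind of cost reduction the abstract refers to, although the actual dimensions used in the paper are not stated here.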

Index Terms: Speech synthesis, speech analysis, linear prediction, pattern recognition

Bibliographic reference. Villavicencio, Fernando / Maestre, Esteban (2010): "GMM-PCA based speaker-timbre conversion on full-quality speech", in SSW7-2010, 56-61.