Sixth ISCA Workshop on Speech Synthesis

Bonn, Germany
August 22-24, 2007

Regression Approaches to Voice Quality Controll Based on One-to-Many Eigenvoice Conversion

Kumi Ohta, Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano

Graduate School of Information Science, Nara Institute of Science and Technology, Japan

This paper proposes techniques for flexibly controlling voice quality of converted speech from a particular source speaker based on one-to-many eigenvoice conversion (EVC). EVC realizes a voice quality control based on the manipulation of a small number of parameters, i.e., weights for eigenvectors, of an eigenvoice Gaussian mixture model (EV-GMM), which is trained with multiple parallel data sets consisting of a single source speaker and many pre-stored target speakers. However, it is difficult to control intuitively the desired voice quality with those parameters because each eigenvector doesn’t usually represent a specific physical meaning. In order to cope with this problem, we propose regression approaches to the EVC-based voice quality controller. The tractable voice quality control of the converted speech is achieved with a low-dimensional voice quality control vector capturing specific voice characteristics. We conducted experimental verifications of each of the proposed approaches.

Full Paper   Presentation (ppt)

Bibliographic reference.  Ohta, Kumi / Ohtani, Yamato / Toda, Tomoki / Saruwatari, Hiroshi / Shikano, Kiyohiro (2007): "Regression approaches to voice quality controll based on one-to-many eigenvoice conversion", In SSW6-2007, 101-106.