The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis

Kyoto, Japan
September 22-24, 2010

On Transforming Spectral Peaks in Voice Conversion

Elizabeth Godoy (1), Olivier Rosec (1), Thierry Chonavel (2)

(1) Orange Labs R&D TECH/ASAP/VOICE, Lannion, France
(2) Télécom Bretagne, Signal & Communications Department, Brest, France

This paper explores the benefits of transforming spectral peaks in voice conversion. First, by examining classic GMM-based transformation with cepstral coefficients, we show that the lack of variance in the transformed data ("over-smoothing") can be related to the choice of spectral parameterization. We therefore propose an alternative parameterization using spectral peaks. The peaks are transformed using HMMs with Gaussian state distributions. Two learning variants are also examined, along with a post-processing step that treats the evolution of peaks over time. In a comparison of the different transformation approaches, spectral peaks are shown to offer higher inter-speaker feature correlation and to yield higher transformed-data variance than their cepstral-coefficient counterparts.
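
The link between inter-speaker correlation and transformed-data variance can be illustrated with a toy case. The following is a minimal sketch, not the paper's method: it assumes a single 1-D joint Gaussian in place of a multi-component GMM or HMM, synthetic unit-variance features in place of cepstral coefficients or spectral peaks, and hypothetical correlation values. Under those assumptions, the minimum mean-square-error linear mapping produces converted features whose variance is roughly the squared source-target correlation times the target variance, so weakly correlated features come out over-smoothed.

```python
import random
import statistics


def variance_ratio(rho, n=20000, seed=0):
    """Return var(converted) / var(target) for a 1-D linear-Gaussian mapping.

    rho is an assumed source-target correlation (a hypothetical value,
    not a figure taken from the paper).
    """
    rng = random.Random(seed)

    # Draw correlated source (x) and target (y) features with unit variance.
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        y = rho * x + (1.0 - rho ** 2) ** 0.5 * noise
        xs.append(x)
        ys.append(y)

    # Fit the least-squares linear map y_hat = mu_y + a * (x - mu_x),
    # which is the conditional mean for a single joint Gaussian.
    mu_x, mu_y = statistics.fmean(xs), statistics.fmean(ys)
    var_x = statistics.pvariance(xs, mu_x)
    cov_xy = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    a = cov_xy / var_x

    # Convert the source features and compare variances.
    converted = [mu_y + a * (x - mu_x) for x in xs]
    return statistics.pvariance(converted) / statistics.pvariance(ys)


if __name__ == "__main__":
    for rho in (0.3, 0.6, 0.9):
        print(f"rho={rho:.1f}  variance ratio={variance_ratio(rho):.3f}")
```

In this toy setting the ratio tracks the squared correlation, which is one way to read the abstract's claim that a parameterization with higher inter-speaker correlation retains more variance after conversion.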

Index Terms: voice conversion, spectral transformation, spectral peaks


Bibliographic reference. Godoy, Elizabeth / Rosec, Olivier / Chonavel, Thierry (2010): "On transforming spectral peaks in voice conversion", in SSW7-2010, 68-73.