Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Spectral Mapping for Voice Conversion Using Speaker Selection and Vector Field Smoothing

Makoto Hashimoto, Norio Higuchi

ATR Interpreting Telecommunications Research Labs., Soraku-gun, Kyoto, Japan

This paper proposes a spectral mapping method for voice conversion that combines speaker selection with vector field smoothing. Using only one word as training data, the method reduced the spectral distance between the transformed speech and the speech of a target speaker by 25% on average over eight target speakers (four male and four female), relative to the distance between the target speaker's speech and that of a speaker selected from among multiple reference speakers. In a listening test, transformed speech samples for one male and one female target speaker were judged closer to their target speakers than to their selected speakers at rates of 67% and 65%, respectively.
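The abstract names the two steps without giving their details, so the following is only a minimal sketch of how they might fit together, not the authors' implementation. It assumes time-aligned feature frames, a Euclidean distance as the spectral distance, and Gaussian weighting for the smoothing; the function names, the `sigma` parameter, and the toy data are all hypothetical.

```python
import math


def mean_distance(frames_a, frames_b):
    """Mean Euclidean distance between paired frames (assumed time-aligned)."""
    if len(frames_a) != len(frames_b) or not frames_a:
        raise ValueError("expected two non-empty, equally long frame sequences")
    return sum(math.dist(a, b) for a, b in zip(frames_a, frames_b)) / len(frames_a)


def select_speaker(references, target_frames):
    """Pick the reference speaker whose frames lie closest to the target's."""
    return min(references, key=lambda name: mean_distance(references[name], target_frames))


def smooth_vectors(points, transfer, sigma=1.0):
    """Replace each transfer vector by a Gaussian-weighted average over all points.

    This is an assumed, simplified form of vector field smoothing: vectors
    attached to nearby points in the feature space influence each other most.
    """
    dim = len(transfer[0])
    smoothed = []
    for p in points:
        weights = [math.exp(-math.dist(p, q) ** 2 / (2 * sigma ** 2)) for q in points]
        total = sum(weights)
        smoothed.append(
            [sum(w * t[d] for w, t in zip(weights, transfer)) / total for d in range(dim)]
        )
    return smoothed


def relative_reduction(before, after):
    """Percentage by which a distance shrank, relative to its starting value."""
    return 100.0 * (before - after) / before


# Toy two-dimensional "spectral" frames, invented purely for illustration.
target = [[1.0, 2.0], [2.0, 3.0]]
references = {
    "ref_a": [[1.5, 2.0], [2.5, 3.0]],
    "ref_b": [[4.0, 6.0], [5.0, 7.0]],
}

selected = select_speaker(references, target)
selected_frames = references[selected]

# Transfer vectors point from the selected speaker's frames to the target's.
transfer = [[t - s for s, t in zip(sf, tf)] for sf, tf in zip(selected_frames, target)]
smoothed = smooth_vectors(selected_frames, transfer, sigma=1.0)
```

In this sketch the baseline for any reported reduction would be `mean_distance(selected_frames, target)`, which mirrors the comparison in the abstract: the transformed speech is measured against the selected speaker's unmodified speech, not against an arbitrary reference speaker.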

