First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Statistical Study on Voice Individuality Conversion across Different Languages

Masanobu Abe, Shigeki Sagayama

ATR Interpreting Telephony Research Laboratories, Kyoto, Japan

In this paper we discuss spectrum differences between languages. This research is motivated by "cross-language voice conversion", whose goal is to preserve the individuality of a speaker's voice when that speaker's utterances are translated and used to synthesize speech in another language. To investigate the spectrum differences caused by language differences, speech uttered by a bilingual speaker is analyzed. The experimental results are as follows: (1) the size of the spectrum space is smaller in the inter-language case (between English and Japanese) than in the inter-speaker case; (2) the overlap between spectrum spaces is larger in the inter-language case than in the inter-speaker case; (3) the spectra unique to English are /r/, /æ/, /s/, /f/, and the spectra unique to Japanese are /i/, /u/, /N/; (4) although there is a critical boundary between the spectra unique to English and the Japanese spectrum space, the Japanese spectrum space contains some spectra that are close to the spectra unique to English; (5) judging from listening tests, dynamic characteristics also play an important role in characterizing a language.

Bibliographic reference.  Abe, Masanobu / Sagayama, Shigeki (1990): "Statistical study on voice individuality conversion across different languages", In ICSLP-1990, 157-160.