First International Conference on Spoken Language Processing (ICSLP 90)
Our aim is to improve both the naturalness of text-to-speech synthesis and its ability to model individual speakers. This paper describes several methods for using inverse-filtered waveforms from natural speech as the voice source in a text-to-speech system. One method plays a stored waveform as a repeating loop and controls pitch by interpolating between its samples. Another method builds a source waveform of the desired pitch by concatenating single pulses drawn from a collection of pulses. Listening tests were carried out to compare these methods with each other and with more traditional voice source generation techniques. The results indicate that these "natural glottal source" methods can substantially improve the quality of text-to-speech synthesis.
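The two source-generation methods described in the abstract can be illustrated with a minimal Python sketch. The abstract gives no algorithmic detail, so everything here is an assumption made for illustration: the function names `loop_source` and `concat_source` are hypothetical, the loop is assumed to hold exactly one pitch period, linear interpolation stands in for whatever interpolation the authors used, and pulses are selected by nearest length to the target period. The toy pulses in the usage note are placeholders, not inverse-filtered speech.

```python
def loop_source(pulse, f0_hz, fs_hz, n_out):
    """Loop method (sketch): read one stored pitch period as a repeating loop.

    Pitch is set by the read step: one pass through the loop takes
    fs_hz / f0_hz output samples. Values between stored samples are
    obtained by linear interpolation (an assumption of this sketch).
    """
    n = len(pulse)
    step = n * f0_hz / fs_hz  # loop positions advanced per output sample
    out = []
    pos = 0.0
    for _ in range(n_out):
        i = int(pos)
        frac = pos - i
        a = pulse[i % n]
        b = pulse[(i + 1) % n]  # wrap around at the end of the loop
        out.append((1.0 - frac) * a + frac * b)
        pos = (pos + step) % n
    return out


def concat_source(pulses, f0_contour_hz, fs_hz):
    """Concatenation method (sketch): one stored pulse per pitch period.

    For each target f0 value, pick the stored pulse whose length is
    closest to the target period in samples (nearest-length selection
    is an assumption of this sketch) and append it to the output.
    """
    out = []
    for f0_hz in f0_contour_hz:
        target_len = fs_hz / f0_hz
        best = min(pulses, key=lambda p: abs(len(p) - target_len))
        out.extend(best)
    return out
```

As a usage example, `loop_source([0.0, 1.0, 0.0, -1.0], 1.0, 8.0, 8)` reads a four-sample loop at half speed and fills in the midpoints, while `concat_source([[1, 0, 0, 0], [1, 0, 0, 0, 0, 0]], [2.0, 1.25], 8.0)` picks the four-sample pulse for the first period and the six-sample pulse for the second. The loop method changes pitch by stretching or compressing the pulse shape, whereas the concatenation method keeps each pulse shape intact and changes pitch by choosing a different pulse.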
Bibliographic reference. Pearson, Stephen D. / Javkin, Hector R. / Matsui, Kenji / Kamai, Takahiro (1990): "Text-to-speech synthesis using a natural voice source", In ICSLP-1990, 193-196.