First International Conference on Spoken Language Processing (ICSLP 90)
We propose a new speech synthesis method that concatenates waveform segments selected from a waveform dictionary and alters their pitch with a modified PSOLA technique. The limits of acceptable pitch shifts are determined by preference tests. To make segment selection more accurate, we introduce a new selection factor that accounts for spectral continuity across voiced phoneme boundaries. The average spectral difference is reduced from 5.4 dB to 2.7 dB, and the synthesized voice is more fluent.
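As an illustration only, segment selection that trades off pitch shift against spectral continuity could be sketched as below. This is not the authors' algorithm: the cost definitions, the `weight` parameter, the dictionary layout (`f0`, `start`, `end`), and the use of dynamic programming are all assumptions made for the example, and the sketch applies the continuity term at every boundary rather than only at voiced phoneme boundaries.

```python
import math

def spectral_distance_db(a, b):
    """Mean absolute difference between two log-magnitude spectra (dB)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def pitch_cost(segment_f0, target_f0):
    """Absolute log2 frequency ratio: 0 when no pitch shift is needed."""
    return abs(math.log2(target_f0 / segment_f0))

def select_segments(candidates, target_f0, weight=0.1):
    """Pick one candidate per position by dynamic programming.

    candidates[i] is a list of dicts with hypothetical keys:
      'f0'    - original pitch of the segment (Hz)
      'start' - log spectrum (dB) at the segment's first frame
      'end'   - log spectrum (dB) at the segment's last frame
    Returns the index of the chosen candidate for each position.
    """
    # best[i][j] = (cheapest total cost ending at candidate j, backpointer)
    best = [[(pitch_cost(c["f0"], target_f0[0]), None) for c in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for c in candidates[i]:
            local = pitch_cost(c["f0"], target_f0[i])
            # Assumed continuity term: spectral mismatch across the join.
            options = [
                (best[i - 1][k][0]
                 + weight * spectral_distance_db(p["end"], c["start"]), k)
                for k, p in enumerate(candidates[i - 1])
            ]
            prev_cost, k = min(options)
            row.append((prev_cost + local, k))
        best.append(row)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda idx: best[-1][idx][0])
    path = [j]
    for i in range(len(candidates) - 1, 0, -1):
        j = best[i][j][1]
        path.append(j)
    return path[::-1]
```

In this sketch a larger `weight` favors segments that join smoothly even if they need a bigger pitch shift, while a smaller `weight` favors segments whose original pitch is close to the target. A hard limit on pitch shift, such as the one the abstract derives from preference tests, could be added by discarding candidates whose `pitch_cost` exceeds a chosen threshold.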
Bibliographic reference. Hirokawa, Tomohisa / Hakoda, Kazuo (1990): "Segment selection and pitch modification for high quality speech synthesis using waveform segments", In ICSLP-1990, 337-340.