4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

An Excitation Synchronous Pitch Waveform Extraction Method and its Application to the VCV-Concatenation Synthesis of Japanese Spoken Words

Yasuhiko Arai (1), Ryo Mochizuki (2), Hirofumi Nishimura (1), Takashi Honda (2)

(1) AVC Development Lab, Matsushita Communication Industrial Co., Ltd., Yokohama, Japan
(2) School of Science and Technology, Meiji University, Kawasaki, Japan

A novel pitch waveform extraction method is proposed. Unlike conventional pitch mark decision algorithms such as the peak search method, the new algorithm determines excitation points based on the Phase Equalized Residual Excited Linear Prediction (PE-RELP) model. A pitch waveform is extracted from two adjacent excitation intervals using an asymmetrical Hanning window. The new method has the advantages of being free from extraction errors caused by formant resonance and of being fully automatic. Therefore, no manual intervention is required and no roughness is heard in the pitch-modified speech. The superiority of the new method has been confirmed by spectral distortion measurement and subjective quality evaluation. Finally, spoken word generation by VCV-waveform concatenation is demonstrated, showing that very natural sounding spoken words can be generated.
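The windowing step described in the abstract can be illustrated with a minimal sketch. It assumes the excitation points are already known as sample indices (the PE-RELP-based decision of those points is not reproduced here) and that the asymmetrical Hanning window consists of a rising half-Hanning over the preceding excitation interval and a falling half-Hanning over the following one. The function names `asymmetric_hanning` and `extract_pitch_waveform` and their parameters are hypothetical, chosen for illustration only, and the exact window definition used by the authors may differ.

```python
import math


def asymmetric_hanning(left_len, right_len):
    """Illustrative asymmetric window (an assumption, not the authors' exact definition).

    Rises from 0 to 1 over left_len samples and falls from 1 toward 0
    over right_len samples, so the peak sits on the central excitation point.
    """
    if left_len <= 0 or right_len <= 0:
        raise ValueError("interval lengths must be positive")
    rising = [0.5 * (1.0 - math.cos(math.pi * n / left_len)) for n in range(left_len)]
    falling = [0.5 * (1.0 + math.cos(math.pi * n / right_len)) for n in range(right_len)]
    return rising + falling


def extract_pitch_waveform(signal, prev_pt, cur_pt, next_pt):
    """Window the two excitation intervals around cur_pt (hypothetical helper).

    prev_pt, cur_pt, next_pt are sample indices of three consecutive
    excitation points, assumed to be given.
    """
    if not (0 <= prev_pt < cur_pt < next_pt <= len(signal)):
        raise ValueError("excitation points must be increasing and inside the signal")
    window = asymmetric_hanning(cur_pt - prev_pt, next_pt - cur_pt)
    segment = signal[prev_pt:next_pt]
    return [s * w for s, w in zip(segment, window)]


if __name__ == "__main__":
    # Toy signal and hand-picked excitation points, for demonstration only.
    toy_signal = [math.sin(2.0 * math.pi * n / 10.0) for n in range(40)]
    pitch_waveform = extract_pitch_waveform(toy_signal, 5, 14, 26)
    print(len(pitch_waveform))
```

With this window shape, the falling half of one pitch waveform and the rising half of the next cover the same excitation interval and sum to one, so overlap-adding the extracted waveforms at their original positions returns the original samples between the first and last excitation points. Pitch modification would then amount to changing the spacing at which the waveforms are overlap-added.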

