INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Inverse Filtering Based Harmonic Plus Noise Excitation Model for HMM-Based Speech Synthesis

Zhengqi Wen, Jianhua Tao

Chinese Academy of Sciences, China

This paper presents a new Voicing Cut-Off Frequency (VCO) estimation method based on inverse filtering. The spectrum of the residual signal obtained by inverse filtering is split into sub-bands, which are clustered into two classes with the K-means algorithm. The Viterbi algorithm is then used to search for a smoothed VCO contour. Based on this VCO estimation method, an adaptation of the Harmonic Noise Model is also proposed to reconstruct the residual signal with both harmonic and noise components. The proposed excitation model reduces the buzziness of speech generated by conventional vocoders that use a simple pulse train, and it has been integrated into an HMM-based speech synthesis system (HTS). A listening test showed that HTS with the new method yields better synthesized speech quality than traditional HTS, which uses only a simple pulse-train excitation model.
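
The two estimation steps named in the abstract (two-class K-means over sub-bands, then a Viterbi search for a smooth contour) can be illustrated with a minimal sketch. The abstract does not give the sub-band feature, the cost functions, or any parameter values, so everything specific here is an assumption made for illustration: the scalar per-band "harmonicity" score, the misclassification-count local cost, the linear jump penalty, and the function names two_means_1d and viterbi_vco. It should not be read as the authors' implementation.

```python
import numpy as np


def two_means_1d(values, iters=20):
    """Plain K-means with k=2 on scalar features (hypothetical band scores).

    Returns labels (1 = cluster with the higher centre) and the two centres.
    """
    values = np.asarray(values, dtype=float)
    centers = np.array([values.min(), values.max()])
    labels = np.zeros(values.shape, dtype=int)
    for _ in range(iters):
        # Assign each value to the nearer centre.
        labels = (np.abs(values - centers[1]) < np.abs(values - centers[0])).astype(int)
        # Move each centre to the mean of its members.
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = values[labels == k].mean()
    return labels, centers


def viterbi_vco(labels, jump_penalty=1.0):
    """Pick one cut-off boundary per frame with a Viterbi search.

    labels: (n_frames, n_bands) array, 1 = harmonic band, 0 = noise band.
    Boundary b means bands [0, b) are harmonic and bands [b, n_bands) are noise.
    Local cost (assumed): number of bands that contradict boundary b.
    Transition cost (assumed): jump_penalty * |b_t - b_{t-1}|.
    """
    labels = np.asarray(labels)
    n_frames, n_bands = labels.shape
    cands = np.arange(n_bands + 1)
    zeros = np.zeros((n_frames, 1), dtype=int)
    noise_below = np.concatenate([zeros, np.cumsum(labels == 0, axis=1)], axis=1)
    harm_below = np.concatenate([zeros, np.cumsum(labels == 1, axis=1)], axis=1)
    harm_above = harm_below[:, -1:] - harm_below
    local = noise_below + harm_above

    trans = jump_penalty * np.abs(cands[:, None] - cands[None, :])
    score = local[0].astype(float)
    back = np.zeros((n_frames, n_bands + 1), dtype=int)
    for t in range(1, n_frames):
        total = score[:, None] + trans          # total[prev, cur]
        back[t] = total.argmin(axis=0)          # best predecessor per candidate
        score = total.min(axis=0) + local[t]

    # Backtrack from the cheapest final candidate.
    path = np.empty(n_frames, dtype=int)
    path[-1] = score.argmin()
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

In this sketch a boundary index would be converted to a frequency by multiplying it by the sub-band width, which is likewise an assumed convention. The jump penalty trades frame-level fit against contour smoothness: a larger value suppresses isolated outlier frames more strongly.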


Bibliographic reference.  Wen, Zhengqi / Tao, Jianhua (2011): "Inverse filtering based harmonic plus noise excitation model for HMM-based speech synthesis", In INTERSPEECH-2011, 1805-1808.