Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
In this paper, we propose a new low-bit-rate speech coding algorithm that uses a fully vector quantized ARMA analysis combined with a glottal model (FVQ-GARMA). Several coding algorithms that estimate spectral parameters and glottal model parameters with the Analysis-by-Synthesis (A-b-S) method have been proposed to synthesize natural-sounding speech. However, these conventional A-b-S methods have several problems: enormous computational load, unstable estimation, and a high coding bit rate caused by pitch-synchronous parameter quantization. To solve these problems, the FVQ-GARMA encoder performs the analysis and the vector quantization simultaneously. At every frame, it selects the set of vector codes (AR, MA, and glottal model) that minimizes the distortion between the resulting synthetic speech and the input speech. In the decoder, synthetic speech is generated from parameters obtained by interpolating the set of vector codes at every pitch period. Pre-selection methods that reduce the computational load are reported. A new glottal model is also proposed; it is represented by five time-domain parameters and makes the glottal waveform easy to control. Subjective tests show that the synthetic speech quality of 2.4 kbps FVQ-GARMA is equal to that of conventional 4.8 kbps CELP, and that FVQ-GARMA can reduce the computational load.
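The per-frame code selection described in the abstract can be illustrated with a minimal sketch. It assumes an exhaustive joint search and a plain squared-error distortion, neither of which the abstract specifies. The names `arma_filter`, `squared_error`, and `select_codes` are hypothetical, and each glottal codebook entry is taken to be a ready-made excitation waveform rather than the five-parameter model of the paper. The pre-selection steps that reduce the search cost are not shown.

```python
from itertools import product


def arma_filter(b, a, x):
    """Direct-form ARMA filter with a[0] == 1:
    y[n] = sum_k b[k] * x[n-k] - sum_{k>=1} a[k] * y[n-k]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y


def squared_error(u, v):
    """Sum of squared sample differences (assumed distortion measure)."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


def select_codes(frame, ar_codebook, ma_codebook, glottal_codebook):
    """Try every (AR, MA, glottal) code triple, synthesize the frame,
    and return (error, ar_index, ma_index, glottal_index) of the best one."""
    best = None
    candidates = product(
        enumerate(ar_codebook),
        enumerate(ma_codebook),
        enumerate(glottal_codebook),
    )
    for (i, a), (j, b), (k, g) in candidates:
        synth = arma_filter(b, a, g)
        err = squared_error(frame, synth)
        if best is None or err < best[0]:
            best = (err, i, j, k)
    return best


# Toy codebooks (illustrative values only, not from the paper).
ar_codebook = [[1.0, -0.5], [1.0, 0.3]]
ma_codebook = [[1.0], [1.0, 0.4]]
glottal_codebook = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0],
]

# Build an "input" frame from a known triple, then recover that triple.
frame = arma_filter(ma_codebook[1], ar_codebook[0], glottal_codebook[1])
best = select_codes(frame, ar_codebook, ma_codebook, glottal_codebook)
```

Because the analysis and the quantization happen in one search, the selected codes are optimal with respect to the synthesized waveform rather than to unquantized parameter estimates. The cost grows with the product of the codebook sizes, which is the motivation for pre-selection.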
Bibliographic reference. Seza, Katsushi / Tasaki, Hirohisa / Takahashi, Shinya (1992): "Fully vector quantized ARMA analysis combined with glottal model for low bit rate coding", In ICSLP-1992, 29-32.