4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

A Novel Approach to the Estimation of Voice Source and Vocal Tract Parameters from Speech Signals

Wen Ding, Hideki Kasuya

Faculty of Engineering, Utsunomiya University, Utsunomiya, Japan

This paper presents a novel adaptive pitch-synchronous analysis method for the simultaneous estimation of voice source and vocal tract (formant/antiformant) parameters from the speech signal. The method uses a parametric Rosenberg-Klatt model to generate the glottal waveform and an autoregressive with exogenous input (ARX) model to represent the speech production process. The time-varying coefficients of the ARX model are estimated with an adaptive algorithm based on the Kalman filter, while the parameters of the Rosenberg-Klatt model are optimized using simulated annealing. In addition, a new hybrid error criterion is used to optimize the glottal opening instant. Furthermore, the fundamental period T0, defined as the interval between two successive glottal closure instants, is estimated automatically from the obtained differentiated glottal waveforms. Experiments using two-channel speech signals (speech and electroglottograph (EGG) signals) and continuous speech show good estimation performance.
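As an illustration of the two model components named above, the following minimal Python sketch generates one period of a differentiated glottal waveform and passes it through an ARX difference equation. It is not the authors' implementation. It assumes the commonly cited polynomial form of the Rosenberg-Klatt pulse, u'(t) = 2at - 3bt^2 during the open phase and zero otherwise, and the ARX sign convention s(n) + sum_i a_i s(n-i) = sum_j b_j u(n-j) + e(n), with the error term e(n) omitted. The function names, the parameter names (`t0`, `oq`, `av`, `fs`), and the coefficient values are hypothetical, and neither the Kalman-filter coefficient tracking nor the simulated-annealing search is shown.

```python
def rk_flow_derivative(t0, oq, av, fs):
    """One period of a differentiated glottal flow (assumed polynomial form).

    t0: fundamental period in seconds, oq: open quotient (0..1),
    av: amplitude parameter, fs: sampling rate in Hz.
    Open phase (0 <= t <= oq*t0): u'(t) = 2*a*t - 3*b*t**2; closed phase: 0.
    """
    a = 27.0 * av / (4.0 * oq ** 2 * t0)
    b = 27.0 * av / (4.0 * oq ** 3 * t0 ** 2)
    n_samples = int(round(t0 * fs))
    t_close = oq * t0  # glottal closure instant within the period
    out = []
    for k in range(n_samples):
        t = k / fs
        out.append(2.0 * a * t - 3.0 * b * t * t if t <= t_close else 0.0)
    return out


def arx_synthesize(u, a_coef, b_coef):
    """Run input u through s(n) = -sum_i a_i s(n-i) + sum_j b_j u(n-j).

    a_coef holds a_1..a_p and b_coef holds b_0..b_q. The coefficients are
    fixed here; in the paper they vary over time and are estimated adaptively.
    """
    s = []
    for n in range(len(u)):
        acc = 0.0
        for i, ai in enumerate(a_coef, start=1):  # autoregressive part
            if n - i >= 0:
                acc -= ai * s[n - i]
        for j, bj in enumerate(b_coef):           # exogenous-input part
            if n - j >= 0:
                acc += bj * u[n - j]
        s.append(acc)
    return s


# Hypothetical settings: 100 Hz voice, open quotient 0.6, 10 kHz sampling.
pulse = rk_flow_derivative(t0=0.01, oq=0.6, av=1.0, fs=10000)
speech = arx_synthesize(pulse, a_coef=[-0.9], b_coef=[1.0])
```

In an analysis setting the direction is reversed: the speech samples are observed, and the source parameters and ARX coefficients are adjusted until the model output matches them under the chosen error criterion.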

