First European Conference on Speech Communication and Technology

Paris, France
September 27-29, 1989

On Nonstationary Hidden Markov Modeling of Speech Signals

A. J. Serralheiro (1), Y. Ephraim (2), Lawrence R. Rabiner (2)

(1) INESC and IST, Lisbon, Portugal
(2) AT&T Bell Laboratories, Murray Hill, New Jersey, USA

We propose an exact maximum likelihood (ML) approach for hidden Markov modeling of speech signals using models with mixtures of Gaussian autoregressive (AR) output probability distributions. This approach differs from the commonly used one in two respects. First, the parameters of the AR models are calculated using the exact, rather than the asymptotic, form of the likelihood function. Second, the gain of each AR model, as well as its shape, is estimated and used during the recognition phase. Since the asymptotic likelihood is appropriate only for sources that are stationary in some sense, the ML approach taken here can be considered an approach to nonstationary modeling. The proposed approach was tested on the task of recognizing isolated spoken letters of the English alphabet from four different speakers, using a single system trained simultaneously on all four talkers (a multi-speaker recognizer). The resulting recognition accuracy is comparable to that obtained with the asymptotic ML approach.
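
To illustrate the distinction between the exact and the asymptotic likelihood, the following is a minimal sketch for the simplest possible case, a zero-mean Gaussian AR(1) source. It is not the authors' implementation: the paper uses mixtures of AR models inside an HMM, whereas the function names, the first-order model, and the parameters `a` (AR coefficient) and `sigma2` (gain, i.e. innovation variance) are assumptions chosen only for illustration. The asymptotic form shown is the familiar autocorrelation-based approximation, which drops the determinant correction and the frame-boundary terms.

```python
import numpy as np


def exact_loglik_ar1(x, a, sigma2):
    """Exact log-likelihood of a frame under a zero-mean Gaussian AR(1) model.

    Assumed model (illustrative): x[t] = a * x[t-1] + e[t], e[t] ~ N(0, sigma2),
    with |a| < 1 and the first sample drawn from the stationary marginal.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # First sample: stationary marginal N(0, sigma2 / (1 - a^2)).
    var0 = sigma2 / (1.0 - a * a)
    ll = -0.5 * (np.log(2.0 * np.pi * var0) + x[0] ** 2 / var0)
    # Remaining samples: one-step prediction residuals are N(0, sigma2).
    resid = x[1:] - a * x[:-1]
    ll += -0.5 * ((n - 1) * np.log(2.0 * np.pi * sigma2) + resid @ resid / sigma2)
    return ll


def asymptotic_loglik_ar1(x, a, sigma2):
    """Asymptotic log-likelihood built from autocorrelations only."""
    x = np.asarray(x, dtype=float)
    n = x.size
    r0 = x @ x            # lag-0 autocorrelation of the frame (unnormalized)
    r1 = x[1:] @ x[:-1]   # lag-1 autocorrelation of the frame
    # Autocorrelation of the inverse-filter coefficients [1, -a].
    alpha0 = 1.0 + a * a
    alpha1 = -a
    quad = alpha0 * r0 + 2.0 * alpha1 * r1
    return -0.5 * (n * np.log(2.0 * np.pi * sigma2) + quad / sigma2)


def likelihood_gap(x, a, sigma2):
    """Exact minus asymptotic log-likelihood for the same frame and model."""
    return exact_loglik_ar1(x, a, sigma2) - asymptotic_loglik_ar1(x, a, sigma2)
```

In this toy case the gap between the two forms reduces to 0.5·log(1 − a²) + 0.5·a²·(x₁² + x_N²)/sigma2, which does not grow with the frame length N. The per-sample difference therefore vanishes for long frames and matters most for short analysis frames, which is consistent with the abstract's remark that the asymptotic form suits sources that are stationary in some sense.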


Bibliographic reference.  Serralheiro, A. J. / Ephraim, Y. / Rabiner, Lawrence R. (1989): "On nonstationary hidden Markov modeling of speech signals", In EUROSPEECH-1989, 1159-1162.