EUROSPEECH '91

Previous work has shown the ability of Artificial Neural Networks (ANNs), and Multilayer Perceptrons (MLPs) in particular, to estimate a posteriori probabilities that can be used, after division by the a priori probabilities of the classes, as emission probabilities for Hidden Markov Models (HMMs). The advantages of a speech recognition system incorporating both MLPs and HMMs are better discrimination and the ability to incorporate multiple sources of evidence (features, temporal context) without restrictive assumptions about distributions or statistical independence. While this approach has been shown to be useful for speech recognition, it is still important to understand its underlying problems and limitations and to consider its consequences for other algorithms. For example, while state-of-the-art HMM-based speech recognizers now model context-dependent phonetic units such as triphones instead of phonemes to improve their performance, most MLP-based approaches are restricted to phoneme models. After a short review, it is shown here how such neural network approaches can be generalized to context-dependent phoneme models. It is also discussed how previous theoretical results can affect the development of other algorithms such as nonlinear Autoregressive (AR) Models and Radial Basis Functions (RBFs).
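The division by the a priori probabilities mentioned above can be illustrated with a minimal sketch. By Bayes' rule, P(q|x) / P(q) = p(x|q) / p(x), so the quotient equals the emission likelihood up to a factor p(x) that is the same for every class at a given frame. The function name, the `floor` guard, and the toy numbers below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def scaled_likelihoods(posteriors, priors, floor=1e-8):
    """Divide per-frame class posteriors by class priors.

    posteriors: list of frames, each a list of P(class | frame) values
                (e.g. MLP outputs; hypothetical values in the demo below).
    priors:     list of P(class) values, one per class.
    floor:      assumed lower bound on a prior to avoid division by zero.
    """
    return [
        [post / max(prior, floor) for post, prior in zip(frame, priors)]
        for frame in posteriors
    ]


# Toy example: two frames, three classes (made-up numbers).
posteriors = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
]
priors = [0.5, 0.3, 0.2]

scaled = scaled_likelihoods(posteriors, priors)
for frame in scaled:
    print(frame)
```

The resulting values are not normalized probabilities; they would typically stand in for the emission terms during HMM decoding, where the common factor p(x) does not change which path scores best.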
Bibliographic reference. Bourlard, Herve (1991): "Neural nets and hidden Markov models: review and generalizations", In EUROSPEECH-1991, 363-369.