5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Continuous Speech Recognition Using a Context Sensitive ANN and HMM2s

Nicolas Pican, Jean-Francois Mari, Dominique Fohr

CRIN-CNRS & INRIA Lorraine, Vandoeuvre-les-Nancy, France

The phonetic context has a large effect on phonemes in a continuous speech signal [1]. Recognition systems that model allophones with context-dependent Hidden Markov Models have therefore been implemented [4]. Second-order HMMs (HMM2s) segment well in the temporal domain [6][7] but have some difficulty in recognition because MLE (Maximum Likelihood Estimation) training is not discriminative, whereas discrimination is one of the strengths of Artificial Neural Network (ANN) models. Over the last three years we have developed a new ANN model named OWE (Orthogonal Weight Estimator) [10][11]. The OWE principle is an ANN that classifies an input pattern according to its contextual environment. This new ANN architecture tackles the problem of training context-dependent behaviour. Roughly, the principle is based on a main MLP (Multilayer Perceptron) in which the value of each synaptic weight is estimated by another MLP (an OWE) as a function of the context representation. In this paper, we present two hybrid systems for phoneme recognition. In both systems, 48 context-independent HMM2s segment the input signal. In the first system, the OWE labels the segments; in the second system, the OWE outputs are the input frames of the HMM2s. In experiments on TIMIT, accuracy ranges from 56% to 67% on the 48-phoneme set.
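The weight-estimation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class names (`TinyMLP`, `ContextSensitiveLayer`), the layer sizes, the tanh and softmax nonlinearities, the random initialisation, and the use of a single main layer instead of a full multilayer perceptron are all assumptions made for brevity, and training is not shown. The sketch only demonstrates the structure: one small network per connection of the main network, each mapping the context vector to the value of that connection.

```python
import math
import random


class TinyMLP:
    """One-hidden-layer MLP mapping a context vector to one scalar (hypothetical OWE)."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def __call__(self, context):
        hidden = [math.tanh(sum(w * c for w, c in zip(row, context)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2


class ContextSensitiveLayer:
    """Main layer in which every weight (and bias) comes from its own estimator."""

    def __init__(self, n_in, n_out, n_context, n_hidden=3, seed=0):
        rng = random.Random(seed)
        # One estimator per connection; the extra column estimates the bias.
        self.estimators = [[TinyMLP(n_context, n_hidden, rng)
                            for _ in range(n_in + 1)]
                           for _ in range(n_out)]

    def weights(self, context):
        """Evaluate all estimators to get the main layer's weights for this context."""
        return [[owe(context) for owe in row] for row in self.estimators]

    def forward(self, x, context):
        """Classify input x with weights estimated from the context (softmax assumed)."""
        w = self.weights(context)
        scores = [sum(wi * xi for wi, xi in zip(row[:-1], x)) + row[-1]
                  for row in w]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]


# Same acoustic frame, two different (made-up) context encodings.
layer = ContextSensitiveLayer(n_in=4, n_out=3, n_context=2)
frame = [0.2, -0.1, 0.5, 0.3]
probs_ctx_a = layer.forward(frame, [1.0, 0.0])
probs_ctx_b = layer.forward(frame, [0.0, 1.0])
```

Because the weights are recomputed from the context on every call, the same input frame can be classified differently under different contexts, which is the context-dependent behaviour the abstract refers to.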


Bibliographic reference.  Pican, Nicolas / Mari, Jean-Francois / Fohr, Dominique (1997): "Continuous speech recognition using a context sensitive ANN and HMM2s", In EUROSPEECH-1997, 95-98.