5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Do Phonetic Features Help to Improve Consonant Identification in ASR?

Jacques Koreman, Bistra Andreeva, William J. Barry

University of the Saarland, Institute of Phonetics, Germany

The hidden Markov modelling experiments presented in this paper show that consonant identification results can be improved substantially if a neural network is used to extract linguistically relevant information from the acoustic signal before applying hidden Markov modelling. The neural network (in this case a combination of two Kohonen networks) takes 12 mel-frequency cepstral coefficients, overall energy and the corresponding delta parameters as input and outputs distinctive phonetic features, such as [±uvular] and [±plosive]. Not only does this preprocessing of the data lead to better consonant identification rates, but the confusions that occur between the consonants are also less severe from a phonetic viewpoint, as is demonstrated. One reason for the improved consonant identification is that the neural network can map acoustically variable consonant realisations onto identical phonetic features, which makes the input to hidden Markov modelling more homogeneous. Furthermore, by using phonetic features the neural network helps the system to focus on linguistically relevant information in the acoustic signal.
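
The mapping of acoustically variable realisations onto identical phonetic features can be illustrated with a minimal sketch. It is not the authors' implementation: the prototype vectors are hand-set stand-ins for trained Kohonen map units, the vectors are 3-dimensional toys rather than the cepstral, energy and delta parameters described above, and the names `UNITS`, `best_matching_unit` and `to_phonetic_features` are hypothetical. Only the two features named in the abstract are used.

```python
import math

# Feature names taken from the abstract's examples; a real system would use more.
FEATURE_NAMES = ("uvular", "plosive")

# ASSUMPTION: hand-set prototypes standing in for trained Kohonen map units.
# Each entry is (weight vector, phonetic feature labelling of that unit).
UNITS = [
    ((0.9, 0.1, 0.0), {"uvular": True,  "plosive": False}),
    ((0.7, 0.3, 0.1), {"uvular": True,  "plosive": False}),
    ((0.1, 0.9, 0.2), {"uvular": False, "plosive": True}),
    ((0.0, 0.2, 0.9), {"uvular": False, "plosive": False}),
]


def best_matching_unit(frame):
    """Return the index of the unit whose weights are closest to the frame."""
    return min(range(len(UNITS)), key=lambda i: math.dist(frame, UNITS[i][0]))


def to_phonetic_features(frame):
    """Map an acoustic frame to a tuple of +1/-1 phonetic feature values."""
    _, labels = UNITS[best_matching_unit(frame)]
    return tuple(1 if labels[name] else -1 for name in FEATURE_NAMES)


# Two acoustically different (toy) frames that fall on different map units...
frame_a = (0.95, 0.05, 0.0)
frame_b = (0.65, 0.35, 0.15)

# ...but receive the same feature vector, here [+uvular, -plosive].
features_a = to_phonetic_features(frame_a)
features_b = to_phonetic_features(frame_b)
```

In this sketch the two frames select different units, yet a downstream hidden Markov model would see the same feature vector for both, which is the sense in which the feature representation is more homogeneous than the raw acoustic parameters.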

Bibliographic reference. Koreman, Jacques / Andreeva, Bistra / Barry, William J. (1998): "Do phonetic features help to improve consonant identification in ASR?", in ICSLP-1998, paper 0549.