First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

The Use of Discriminant Neural Networks in the Integration of Acoustic Cues for Voicing into a Continuous-Word Recognition System

Claude Lefebvre, Dariusz A. Zwierzynski

Speech Research Centre, National Research Council of Canada, Building U-61, Montreal Road, Ottawa, Ontario, Canada

The performance of a small-vocabulary, speaker-dependent, robust speech recogniser can be improved by adding more input features to the front end. Our present speech recognition system employs both static and dynamic spectral representations, which are combined using linear discriminant analysis. We carried out recognition experiments with CVC words that differ only in their initial consonant phoneme (e.g. "peep" vs. "beep") and found that most errors arise because the system fails to distinguish voiceless from voiced stop consonants. Several acoustic cues can help to distinguish voiceless from voiced plosives, specifically the fundamental frequency at voicing onset and the Voice Onset Time (VOT). This paper reports on recognition experiments in which both of these features are extracted from the speech signal and combined with the other features using the linear discriminant network. The results confirmed that adding these two input features improved the performance of the recogniser on confusable word pairs.
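The abstract does not describe the discriminant network in detail, so the following is only a toy sketch of the general idea: a voicing-cue pair (VOT and onset F0) is appended to a spectral feature, and a two-class Fisher linear discriminant is fitted to the combined vector. The function names, the single "spectral" score, and every numeric value are assumptions made up for illustration; none of them come from the paper.

```python
# Toy sketch: two-class Fisher linear discriminant over a feature vector that
# concatenates one spectral score with two voicing cues (VOT, onset F0).
# All data values are invented for illustration; they are NOT from the paper.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def scatter(vectors, mu):
    """Sum of outer products of deviations from the class mean."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for v in vectors:
        diff = [v[i] - mu[i] for i in range(d)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_fisher(class0, class1, ridge=1e-6):
    """Return (w, threshold) with w = Sw^-1 (mu1 - mu0)."""
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    d = len(mu0)
    sw = [[s0[i][j] + s1[i][j] for j in range(d)] for i in range(d)]
    for i in range(d):
        sw[i][i] += ridge  # small ridge keeps Sw invertible
    w = solve(sw, [mu1[i] - mu0[i] for i in range(d)])
    # Threshold halfway between the projected class means.
    threshold = 0.5 * (dot(w, mu0) + dot(w, mu1))
    return w, threshold

def classify(x, w, threshold):
    return "voiceless" if dot(w, x) > threshold else "voiced"

# Hypothetical tokens: [spectral score, VOT in ms, F0 at voicing onset in Hz].
# The spectral score overlaps between classes; the two added cues do not.
voiced = [[0.9, 10.0, 118.0], [1.1, 15.0, 122.0],
          [1.0, 8.0, 120.0], [1.2, 12.0, 125.0]]      # e.g. "beep"
voiceless = [[1.0, 60.0, 132.0], [1.1, 70.0, 135.0],
             [0.9, 55.0, 130.0], [1.2, 65.0, 138.0]]  # e.g. "peep"

w, threshold = fit_fisher(voiced, voiceless)
print(classify([1.0, 62.0, 133.0], w, threshold))
```

In this sketch the invented spectral score is deliberately uninformative, so any separation comes from the two appended voicing cues, mirroring the motivation given in the abstract.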


Bibliographic reference. Lefebvre, Claude / Zwierzynski, Dariusz A. (1990): "The use of discriminant neural networks in the integration of acoustic cues for voicing into a continuous-word recognition system", in ICSLP-1990, 1073-1076.