First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

The TDNN-LR Large-Vocabulary and Continuous Speech Recognition System

Hidefumi Sawai

ATR Interpreting Telephony Research Laboratories, Kyoto, Japan

This paper describes an integration of speech recognition and language processing. The speech recognition part consists of Large Phonemic Time-Delay Neural Networks (TDNN), which can automatically spot all 24 Japanese phonemes by simply scanning the input speech. The language processing part is a predictive LR parser, which predicts subsequent phonemes based on the phonemes processed so far. We call this hybrid integrated recognition system the 'TDNN-LR' method. The TDNN-LR recognition system provides large-vocabulary and continuous speech recognition. Two kinds of recognition experiments, large-vocabulary isolated word recognition and continuous speech recognition, were performed using the TDNN-LR method. Speaker-dependent recognition rates of 92.6% for the first choice and 97.6% within the top two choices were obtained for 5,240 common Japanese words, and rates of 65.1% for the first choice and 88.8% within the top five choices were attained for phrase recognition.
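The predict-then-verify idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation: a lexicon prefix table stands in for the predictive LR parser, the per-segment phoneme scores are made-up stand-ins for TDNN outputs, and the input is assumed to be already divided into one segment per phoneme. The names `build_predictor`, `recognize`, and `beam_width` are hypothetical.

```python
import math


def build_predictor(lexicon):
    """Map each phoneme prefix to the set of phonemes allowed to follow it.

    Assumption: a simple prefix table stands in for the predictive LR parser.
    """
    table = {}
    for phonemes in lexicon.values():
        for i, phoneme in enumerate(phonemes):
            table.setdefault(tuple(phonemes[:i]), set()).add(phoneme)
    return table


def recognize(segment_scores, lexicon, beam_width=5):
    """Rank lexicon words by accumulated log score over the segments.

    segment_scores: one dict per segment, mapping phoneme -> score in (0, 1].
    These are hypothetical stand-ins for phoneme-spotting outputs.
    """
    predictor = build_predictor(lexicon)
    word_of = {tuple(p): w for w, p in lexicon.items()}
    beam = [((), 0.0)]  # (phoneme prefix, accumulated log score)
    for scores in segment_scores:
        candidates = []
        for prefix, total in beam:
            # Only phonemes predicted from the current prefix are verified.
            for phoneme in predictor.get(prefix, ()):
                score = scores.get(phoneme, 1e-6)  # floor for unseen phonemes
                candidates.append((prefix + (phoneme,), total + math.log(score)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:beam_width]
    # Keep only hypotheses that spell a complete word in the lexicon.
    return [(word_of[p], s) for p, s in beam if p in word_of]


# Toy lexicon with romanized phoneme strings (illustrative only).
lexicon = {
    "kobe": ["k", "o", "b", "e"],
    "kome": ["k", "o", "m", "e"],
    "kyoto": ["k", "y", "o", "t", "o"],
}
segment_scores = [
    {"k": 0.9, "t": 0.1},
    {"o": 0.8, "a": 0.2},
    {"b": 0.6, "m": 0.4},
    {"e": 0.9, "i": 0.1},
]
ranked = recognize(segment_scores, lexicon)
best_word = ranked[0][0]
```

Because hypotheses are extended only with predicted phonemes, the scorer never evaluates phoneme sequences that the language model rules out, and the ranked list gives first-choice and top-N candidates in the sense used for the reported recognition rates.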


Bibliographic reference. Sawai, Hidefumi (1990): "The TDNN-LR large-vocabulary and continuous speech recognition system", in ICSLP-1990, 1349-1352.