First International Conference on Spoken Language Processing (ICSLP 90)
This paper describes an integration of speech recognition and language processing. The speech recognition part consists of Large Phonemic Time-Delay Neural Networks (TDNN), which can automatically spot all 24 Japanese phonemes by simply scanning the input speech. The language processing part is a predictive LR parser, which predicts subsequent phonemes from the phonemes processed so far. We call this hybrid integrated recognition system the 'TDNN-LR' method. The TDNN-LR recognition system provides large-vocabulary, continuous speech recognition. Two kinds of recognition experiments, large-vocabulary isolated word recognition and continuous speech recognition, were performed using the TDNN-LR method. Speaker-dependent recognition rates of 92.6% for the first choice and 97.6% within the top two choices were obtained for 5,240 common Japanese words, and rates of 65.1% for the first choice and 88.8% within the top five choices were attained for phrase recognition.
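The prediction-driven idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. A prefix lookup over a tiny hypothetical lexicon stands in for the predictive LR parser, and a list of per-slot phoneme probabilities stands in for the TDNN phoneme-spotting output, with one slot per phoneme assumed for simplicity. The names `LEXICON`, `predict_next`, and `recognize` are illustrative only.

```python
import math

# Hypothetical toy lexicon (word -> phoneme sequence); not taken from the paper.
LEXICON = {
    "aka": ["a", "k", "a"],
    "aki": ["a", "k", "i"],
    "asa": ["a", "s", "a"],
}


def predict_next(prefix):
    """Return the phonemes that may follow `prefix` in some lexicon word.

    Stand-in for the predictive LR parser, which would consult its
    parsing table instead of scanning a word list.
    """
    n = len(prefix)
    return {p[n] for p in LEXICON.values() if len(p) > n and p[:n] == prefix}


def recognize(scores, beam=5):
    """Rank lexicon words given per-slot phoneme probabilities.

    `scores` is a list of dicts (phoneme -> probability), one per phoneme
    slot. This is an assumed simplification of scanning TDNN outputs over time.
    """
    hyps = [([], 0.0)]  # (phoneme prefix, accumulated log-probability)
    for slot in scores:
        extended = []
        for prefix, logp in hyps:
            # Only phonemes predicted by the language side are scored.
            for ph in predict_next(prefix):
                p = slot.get(ph, 1e-6)  # small floor for unseen phonemes
                extended.append((prefix + [ph], logp + math.log(p)))
        extended.sort(key=lambda h: h[1], reverse=True)
        hyps = extended[:beam]  # beam pruning
    words = {tuple(v): k for k, v in LEXICON.items()}
    return [(words[tuple(p)], lp) for p, lp in hyps if tuple(p) in words]
```

Calling `recognize` with one probability dictionary per phoneme slot returns the lexicon words ordered from best to worst score, which mirrors the "first choice" and "top N choices" ranking reported in the experiments.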
Bibliographic reference. Sawai, Hidefumi (1990): "The TDNN-LR large-vocabulary and continuous speech recognition system", In ICSLP-1990, 1349-1352.