First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

On the Robustness of HMM and ANN Speech Recognition Algorithms

Yasuhiro Minami (1), Toshiyuki Hanazawa (2), Hitoshi Iwamida (3), Erik McDermott (3), Kiyohiro Shikano (4), Shigeru Katagiri (3), Masaona Kagawa (1)

(1) Keio University; (2) Mitsubishi Electric Corporation; (3) ATR Auditory and Visual Perception Research Laboratories; (4) NTT Human Interface Laboratories; Japan

The robustness of HMM and ANN speech recognizers is studied in a speaking-mode-independent setting, in which the recognizers are trained on phoneme tokens extracted from isolated-word speech data and tested on phonemes taken from both isolated-word and continuous speech data. In this setting there is considerable variation in phoneme identity between the training and testing data. We examined six recognizers: a discrete HMM, a continuous HMM, TDNN, Shift-tolerant LVQ2, an LVQ-HMM hybrid algorithm, and a Fuzzy LVQ-HMM hybrid algorithm. The experimental results show two main points: 1) recognizers with great discriminative power on the isolated-mode testing data do not necessarily perform well on the continuous-mode testing data, and 2) algorithms such as Shift-tolerant LVQ2 and Fuzzy LVQ-HMM (which integrates Fuzzy VQ into LVQ-HMM) may be able to achieve robust as well as accurate recognition, i.e., perform well on both continuous-mode and isolated-mode data.
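
As an illustration of the difference between ordinary VQ and the Fuzzy VQ mentioned above, the following is a minimal Python sketch. It assumes the standard fuzzy c-means membership formula; the abstract does not give the paper's exact formulation, so the function names (`hard_vq`, `fuzzy_vq`), the fuzziness exponent `m`, and the toy codebook are all hypothetical.

```python
import math

def hard_vq(x, codebook):
    """Ordinary VQ: return the index of the single nearest codeword."""
    dists = [math.dist(x, c) for c in codebook]
    return min(range(len(codebook)), key=dists.__getitem__)

def fuzzy_vq(x, codebook, m=2.0, eps=1e-12):
    """Fuzzy VQ (assumed fuzzy c-means form): return soft memberships.

    Each codeword receives a weight proportional to
    (1 / squared_distance) ** (1 / (m - 1)); the weights sum to 1.
    """
    # Clamp squared distances so an exact codeword match does not divide by zero.
    d2 = [max(math.dist(x, c) ** 2, eps) for c in codebook]
    power = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** power for d in d2]
    total = sum(inv)
    return [v / total for v in inv]

# Toy 2-D codebook and feature vector (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
x = (0.4, 0.1)

nearest = hard_vq(x, codebook)
memberships = fuzzy_vq(x, codebook)
print(nearest, memberships)
```

Hard VQ discards everything except the nearest codeword, whereas the fuzzy memberships retain how close the input is to each alternative. One plausible reading is that this graded information helps when test tokens drift away from the training distribution, as continuous speech does here.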

Bibliographic reference. Minami, Yasuhiro / Hanazawa, Toshiyuki / Iwamida, Hitoshi / McDermott, Erik / Shikano, Kiyohiro / Katagiri, Shigeru / Kagawa, Masaona (1990): "On the robustness of HMM and ANN speech recognition algorithms", In ICSLP-1990, 1345-1348.