5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Speech Recognition Using HMM-State Confusion Characteristics

Yumi Wakita, Harald Singer, Yoshinori Sagisaka

ATR Interpreting Telecommunications Research Laboratories, Kyoto, Japan

In our previous work, we proposed re-entry modeling of phonemes that are lost during the search process. In re-entry modeling, the recognition results are postprocessed, and the originally recognized phoneme sequences are converted to new phoneme sequences using HMM-state confusion characteristics spanning several phonemes. We confirmed that HMM-state confusions are effective for re-entry modeling. In this paper, we propose re-entry modeling during recognition using a multiple-pronunciation dictionary in which pronunciations are added using HMM-state confusion characteristics. The pronunciations are added taking into account the part-of-speech (POS) dependency of the confusion characteristics. In continuous recognition experiments, we confirmed that two points are effective in improving word recognition rates: (1) expressing confusions as HMM-state sequences, and (2) adding pronunciations with consideration of the POS dependency of the confusion characteristics. ... they cannot cope with the confusion in consideration of the previous and following context of misrecognized sequences.


Bibliographic reference.  Wakita, Yumi / Singer, Harald / Sagisaka, Yoshinori (1997): "Speech recognition using HMM-state confusion characteristics", In EUROSPEECH-1997, 7-10.