EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Towards The Creation of Acoustic Models for Stressed Japanese Speech

Kozo Okuda, Tomoko Matsui, Satoshi Nakamura

ATR Spoken Language Translation Research Laboratories, Japan

In error recovery utterances, users of a speech recognition system change their speaking style to help the system recognize their speech. However, this change creates a mismatch between the input speech and the acoustic models and reduces system performance. This degradation is a serious problem for speech recognition in dialog systems and speech translation systems. In Japanese error recovery utterances, syllable-stressed speech occurs more frequently. In syllable-stressed speech, each syllable is uttered slowly and with emphasis. This modification strongly alters the characteristics of each syllable, and speech recognition performance drops. This paper investigates how to create acoustic models that are robust in recognizing error recovery utterances, especially syllable-stressed speech. We propose an acoustic modeling method for syllable-stressed speech that combines existing acoustic models. Our results indicate that the proposed method improves system performance. Furthermore, the method requires neither expansion of the recognition dictionary nor explicit model selection.

Bibliographic reference.  Okuda, Kozo / Matsui, Tomoko / Nakamura, Satoshi (2001): "Towards the creation of acoustic models for stressed Japanese speech", In EUROSPEECH-2001, 1653-1656.