EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Speaking Rate Dependent Acoustic Modeling for Spontaneous Lecture Speech Recognition

Hiroaki Nanjo, Kazuomi Kato, Tatsuya Kawahara

Kyoto University, Japan

This paper addresses large-vocabulary spontaneous speech recognition, focusing on acoustic modeling that takes the speaking rate into account. Using a corpus of real lecture speech collected under a priority research project in Japan, we built a baseline acoustic model and evaluated it on the automatic transcription of oral presentations by experienced speakers, obtaining a word accuracy of 58.2%. Compared with read speech, we observed a significant difference in speaking rate. To handle fast and poorly articulated phone segments, we explore several extensions of the modeling: state-skipping modeling, speaking-rate-dependent models, and syllable sub-word modeling. As a result, the word error rate was reduced by 0.8%-2.0% absolute. We also address language modeling, in particular the effective use of various large text corpora.
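
As a rough illustration of why state skipping can help with fast phone segments, the sketch below builds a left-to-right HMM transition matrix with and without skip transitions and computes the minimum number of frames each topology can consume. It is not the authors' implementation: the three-state topology, the probability values, and the function names `left_to_right_transitions` and `min_frames` are all assumptions made for this example.

```python
def left_to_right_transitions(n_states, self_loop=0.6, skip=0.0):
    """Build a transition matrix for a left-to-right HMM (illustrative values).

    Rows are emitting states 0..n_states-1; the extra last column is a
    non-emitting exit state. With skip > 0, each state may also jump over
    its immediate successor.
    """
    if not (0.0 <= skip and 0.0 < self_loop and self_loop + skip < 1.0):
        raise ValueError("need 0 <= skip, 0 < self_loop, self_loop + skip < 1")
    A = [[0.0] * (n_states + 1) for _ in range(n_states)]
    for i in range(n_states):
        has_skip = skip > 0.0 and i + 2 <= n_states
        A[i][i] = self_loop                                  # stay in state
        A[i][i + 1] = 1.0 - self_loop - (skip if has_skip else 0.0)  # advance
        if has_skip:
            A[i][i + 2] = skip                               # skip one state
    return A


def min_frames(A):
    """Smallest number of frames needed to go from state 0 to the exit."""
    n = len(A)
    best = [0] * (n + 1)  # best[n] is the exit state: no more frames needed
    for i in range(n - 1, -1, -1):
        # One frame is emitted in state i, then take the cheapest forward arc.
        best[i] = 1 + min(best[j] for j in range(i + 1, n + 1) if A[i][j] > 0.0)
    return best[0]


plain = left_to_right_transitions(3)            # no skip transitions
skipping = left_to_right_transitions(3, skip=0.1)
print(min_frames(plain), min_frames(skipping))
```

Under these assumptions, the plain three-state model cannot match a segment shorter than three frames, while the skipping variant can match one of two frames, which is the kind of flexibility that short, fast-spoken segments call for.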

Bibliographic reference. Nanjo, Hiroaki / Kato, Kazuomi / Kawahara, Tatsuya (2001): "Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition", in EUROSPEECH-2001, 2531-2534.