Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Speech Recognition Using Stochastic Explicit-Segment Modeling

Hong C. Leung, I. Lee Hetherington, Victor W. Zue

Spoken Language Systems Group, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA

In this paper, we present a stochastic explicit-segment modeling (SESM) approach to speech recognition. The approach addresses three major problems: (1) estimating the probability of a segmentation, which is formulated as a boundary classification problem; (2) estimating the probability of a phonetic unit in a segment, which is treated as a phonetic classification problem; and (3) proposing the boundaries and segments used for problems (1) and (2), which is done by stochastic segmentation. In our current implementation, artificial neural networks are used to deal with the first two problems. We have experimented with SESM on a task of recognizing 25 words (city names) recorded from actual customers over the telephone network. Performance evaluation shows that our approach achieves a recognition accuracy of over 93%, or about 99% at a rejection rate of 20%.
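As a rough illustration of how the first two problems could fit together, the sketch below scores a hypothesis by combining boundary-classifier outputs with phonetic-classifier outputs, and picks the best path by dynamic programming. The abstract does not say how the probabilities are combined or searched, so the independence assumption across candidate boundaries, the function `best_hypothesis`, the callback `segment_label_prob`, and the toy numbers are all assumptions made for illustration, not the authors' implementation.

```python
import math

EPS = 1e-12  # guards against log(0)


def _log(p):
    return math.log(min(max(p, EPS), 1.0))


def best_hypothesis(boundary_prob, segment_label_prob):
    """Return (log_score, [(start, end, label), ...]) for the best path.

    boundary_prob[i]: assumed P(candidate i is a true boundary), e.g. from a
        boundary classifier; the first and last candidates are treated as certain.
    segment_label_prob(i, j): hypothetical callback returning a dict
        {label: P(label | segment between candidates i and j)}.
    """
    n = len(boundary_prob)
    best = [-math.inf] * n   # best[j]: best log score of a path ending at j
    back = [None] * n        # back[j]: (previous boundary, label of last segment)
    best[0] = 0.0
    for j in range(1, n):
        # Interior end points pay log P(boundary); the final one is assumed certain.
        end_term = _log(boundary_prob[j]) if j < n - 1 else 0.0
        for i in range(j):
            # Candidates strictly between i and j are skipped: each pays log(1 - p).
            skipped = sum(_log(1.0 - boundary_prob[k]) for k in range(i + 1, j))
            label, p_label = max(segment_label_prob(i, j).items(),
                                 key=lambda kv: kv[1])
            score = best[i] + skipped + end_term + _log(p_label)
            if score > best[j]:
                best[j] = score
                back[j] = (i, label)
    # Follow the back-pointers from the last boundary to the first.
    path, j = [], n - 1
    while j > 0:
        i, label = back[j]
        path.append((i, j, label))
        j = i
    path.reverse()
    return best[n - 1], path


# Toy example with three candidate boundaries and two made-up labels.
toy = {
    (0, 1): {"a": 0.8, "b": 0.2},
    (1, 2): {"a": 0.3, "b": 0.7},
    (0, 2): {"a": 0.5, "b": 0.5},
}
score, path = best_hypothesis([1.0, 0.9, 1.0], lambda i, j: toy[(i, j)])
print(path, round(math.exp(score), 3))
```

In this toy case the path that keeps the middle boundary wins, because the boundary classifier rates it as likely and both resulting segments are classified with reasonable confidence.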


Bibliographic reference. Leung, Hong C. / Hetherington, I. Lee / Zue, Victor W. (1991): "Speech recognition using stochastic explicit-segment modeling", in EUROSPEECH-1991, 931-934.