EUROSPEECH '91

In this paper, we present a stochastic explicit-segment modeling (SESM) approach to speech recognition. This approach can be characterized by three major problems: (1) estimation of the probability of a segmentation is formulated as a boundary classification problem, (2) estimation of the probability of a phonetic unit in a segment is treated as a phonetic classification problem, and (3) boundaries and segments used for problems (1) and (2) are proposed by stochastic segmentation. In our current implementation, artificial neural networks are used to deal with the first two problems. We have experimented with SESM on a task of recognizing 25 words (city names) recorded from actual customers over the telephone network. Performance evaluation shows that our approach achieves a recognition accuracy of over 93%, or about 99% at a rejection rate of 20%.
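As a rough illustration of how problems (1) and (2) might fit together, the sketch below scores one word hypothesis by combining boundary-classifier and phonetic-classifier outputs in the log domain. This is only a sketch under stated assumptions: the abstract does not say how the two probabilities are combined, so the independence assumption, the function names, and the example probabilities are all hypothetical and are not taken from the paper.

```python
import math


def segmentation_log_prob(boundary_probs, chosen):
    """Log-probability of one segmentation (hypothetical formulation).

    boundary_probs[i]: classifier output in (0, 1), the probability that
        candidate boundary i is a true boundary.
    chosen[i]: True if the hypothesized segmentation places a boundary there.
    Assumption (not from the paper): boundary decisions are independent.
    """
    total = 0.0
    for p, is_boundary in zip(boundary_probs, chosen):
        total += math.log(p if is_boundary else 1.0 - p)
    return total


def hypothesis_log_score(boundary_probs, chosen, phone_probs):
    """Combine the segmentation score with per-segment phonetic scores.

    phone_probs[k]: classifier output in (0, 1), the probability of the
        hypothesized phonetic unit in segment k.
    Assumption (not from the paper): the scores simply add in the log domain.
    """
    phonetic = sum(math.log(q) for q in phone_probs)
    return segmentation_log_prob(boundary_probs, chosen) + phonetic


# Made-up numbers: three candidate boundaries, two of them kept,
# and two segments scored by the phonetic classifier.
score = hypothesis_log_score([0.9, 0.2, 0.8], [True, False, True], [0.7, 0.6])
```

In a full recognizer, a score like this would be computed for each word hypothesis over the segmentations proposed in problem (3), and the best-scoring word would be chosen.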
Bibliographic reference. Leung, Hong C. / Hetherington, I. Lee / Zue, Victor W. (1991): "Speech recognition using stochastic explicit-segment modeling", In EUROSPEECH-1991, 931-934.