4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
In recent speech recognition technology, the score of a hypothesis is often defined on the basis of HMM likelihood. As is well known, however, using the likelihood directly as a scoring function causes difficulties, especially when the length of the speech segment varies with the hypothesis, as in word-spotting, so some form of normalization is indispensable. In this paper, a new method of likelihood normalization using an ergodic HMM is presented, and its performance is compared with that of conventional methods. The comparison is made from three points of view: recognition rate, word-end detection power, and mean hypothesis length. It is concluded that the proposed method gives the best overall performance.
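The length dependence mentioned above can be illustrated with a small sketch. The abstract does not give the formulas of the proposed method, so the following is only a generic illustration of one common idea: subtracting the log-likelihood of a fully connected (ergodic) background HMM from that of the word HMM over the same segment. Discrete observation symbols, the toy model parameters, and the names `forward_log_likelihood` and `normalized_score` are all assumptions made for this example, not taken from the paper.

```python
import math


def safe_log(p):
    # log(0) is treated as -inf so forbidden transitions drop out cleanly.
    return math.log(p) if p > 0 else -math.inf


def log_sum_exp(values):
    # Numerically stable log(sum(exp(v))) that tolerates -inf entries.
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, model):
    """Return log P(obs | model) using the forward algorithm.

    model is a tuple (init, trans, emit) for a discrete-symbol HMM
    (an assumption of this sketch).
    """
    init, trans, emit = model
    n = len(init)
    alpha = [safe_log(init[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def normalized_score(obs, word_model, background_model):
    # Log-likelihood ratio: word HMM score minus ergodic background score
    # computed over the same observation segment.
    return (forward_log_likelihood(obs, word_model)
            - forward_log_likelihood(obs, background_model))


# Hypothetical toy models over two symbols (0 and 1).
# Left-to-right word model: state 0 prefers symbol 0, state 1 prefers symbol 1.
WORD_MODEL = (
    [1.0, 0.0],
    [[0.6, 0.4], [0.0, 1.0]],
    [[0.9, 0.1], [0.1, 0.9]],
)
# Ergodic background model: every state can follow every state.
ERGODIC_MODEL = (
    [0.5, 0.5],
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.7, 0.3], [0.3, 0.7]],
)

short_segment = [0, 1]
long_segment = [0, 0, 0, 1, 1, 1]

raw_short = forward_log_likelihood(short_segment, WORD_MODEL)
raw_long = forward_log_likelihood(long_segment, WORD_MODEL)
norm_match = normalized_score([0, 1], WORD_MODEL, ERGODIC_MODEL)
norm_mismatch = normalized_score([1, 0], WORD_MODEL, ERGODIC_MODEL)
```

In this toy setting the raw log-likelihood of the longer segment is lower than that of the shorter one even though both fit the word model well, which is why raw scores of hypotheses with different lengths are hard to compare. The normalized score is positive for the segment that matches the word model and negative for the reversed segment.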
Bibliographic reference. Ozeki, Kazuhiko (1996): "Likelihood normalization using an ergodic HMM for continuous speech recognition", In ICSLP-1996, 2301-2304.