5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Evaluation of Model Adaptation by HMM Decomposition on Telephone Speech Recognition

Tetsuya Takiguchi (1), Satoshi Nakamura (1), Kiyohiro Shikano (1), Masatoshi Morishima (2), Toshihiro Isobe (2)

(1) Nara Institute of Science and Technology, Japan
(2) Laboratory for Information Technology, NTT DATA Corporation, Japan

In this paper, we evaluate the performance of model adaptation by the previously proposed HMM decomposition method on telephone speech recognition. The HMM decomposition method separates a composed HMM into a known phoneme HMM and an unknown noise and channel HMM by maximum likelihood (ML) estimation of the HMM parameters. A transfer function (telephone channel) HMM is estimated from adaptation speech data by applying HMM decomposition twice: once in the linear spectral domain for the noise, and once in the cepstral domain for the channel. The telephone speech data for evaluation were recorded through 10 kinds of ordinary analog telephone handsets and cordless telephone handsets. The test results show that the average phrase accuracy with the clean speech HMMs is 60.9% for the ordinary analog telephone handsets and 19.6% for the cordless telephone handsets. With the HMM decomposition method, the average phrase accuracy improves to 78.1% for the ordinary analog telephone handsets and 50.5% for the cordless telephone handsets.
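The choice of two domains can be illustrated with a toy sketch. It is not the authors' ML estimation over HMM parameters: it works on a single spectral vector, and it assumes an observation model of the form observed = speech × channel + noise, which the abstract does not state explicitly. All numeric values and function names are hypothetical. Under that assumption, noise is additive in the linear spectral domain, and the channel becomes additive after taking the logarithm and a cosine transform (a simple stand-in for the cepstrum).

```python
import math


def dct(values):
    """Unnormalized DCT-II, used here as a simple log-spectrum-to-cepstrum transform."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (k + 0.5) * i / n) for k, v in enumerate(values))
        for i in range(n)
    ]


def to_cepstrum(power_spectrum):
    """Cepstrum-like vector: cosine transform of the log power spectrum."""
    return dct([math.log(p) for p in power_spectrum])


# Hypothetical power spectra over four frequency bins (illustrative values only).
speech = [4.0, 2.5, 1.2, 0.6]    # known clean speech
channel = [0.9, 0.7, 0.5, 0.4]   # unknown channel response, to be recovered
noise = [0.3, 0.3, 0.2, 0.2]     # additive noise, treated as known in this toy

# Assumed observation model: the channel multiplies the speech, then noise is added.
observed = [s * h + n for s, h, n in zip(speech, channel, noise)]

# Step 1, linear spectral domain: the noise is additive, so it can be subtracted.
filtered = [o - n for o, n in zip(observed, noise)]

# Step 2, cepstral domain: the channel is additive, so subtracting the known
# speech cepstrum leaves the channel cepstrum.
channel_cep_est = [f - s for f, s in zip(to_cepstrum(filtered), to_cepstrum(speech))]

channel_cep_true = to_cepstrum(channel)
max_error = max(abs(a - b) for a, b in zip(channel_cep_est, channel_cep_true))
print(f"max abs error in recovered channel cepstrum: {max_error:.2e}")
```

In this noise-free-of-estimation-error toy, the recovered channel cepstrum matches the true one up to floating-point rounding. The method described in the abstract operates on HMM distributions estimated from adaptation speech rather than on exact vectors, so the sketch shows only why each separation step is carried out in its respective domain.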


Bibliographic reference.  Takiguchi, Tetsuya / Nakamura, Satoshi / Shikano, Kiyohiro / Morishima, Masatoshi / Isobe, Toshihiro (1998): "Evaluation of model adaptation by HMM decomposition on telephone speech recognition", In ICSLP-1998, paper 0698.