EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


A Hybrid Approach to Enhance Task Portability of Acoustic Models in Chinese Speech Recognition

Jin-Song Zhang, Shu-Wu Zhang, Yoshinori Sagisaka, Satoshi Nakamura

ATR Spoken Language Translation Research Laboratories, Japan

This paper presents an approach to improving the portability of acoustic models by mitigating the phonetic mismatch that arises when a new testing task differs substantially from the training data. The approach is a hybrid one: it combines knowledge-based context categorization, which generates a context-rich set of subword units, with data-driven acoustic model clustering at the level of context categories. Compared with the conventional approach, which relies only on phonetic decision tree based model clustering and unseen model generation, the new approach greatly improved the desired subword coverage for the new testing domain and achieved a 10.8% error rate reduction, measured on Chinese character accuracy, in the recognition experiments. Together with the effect of 9 newly adopted glottal stop basic units, we achieved a total error rate reduction of 23.5% in testing compared with the baseline system.
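
As a rough illustration of why categorizing contexts can raise subword coverage on a new task, the following minimal sketch compares the coverage of exact context-dependent units with the coverage of units whose left and right contexts are mapped to broad categories. The phone inventory, the `CATEGORY` table, and the toy training and test units are all hypothetical and are not taken from the paper; the paper's actual categories and clustering procedure are not described in this abstract.

```python
from typing import Iterable, Tuple

# A context-dependent unit: (left context, center phone, right context).
Unit = Tuple[str, str, str]

# Hypothetical broad context categories (an assumption for illustration only).
CATEGORY = {
    "b": "stop", "p": "stop", "d": "stop", "t": "stop",
    "m": "nasal", "n": "nasal",
    "a": "vowel", "i": "vowel", "u": "vowel",
}


def to_category_unit(unit: Unit) -> Unit:
    """Replace the left and right contexts by their broad categories."""
    left, center, right = unit
    return (CATEGORY[left], center, CATEGORY[right])


def coverage(train_units: Iterable[Unit], test_units: Iterable[Unit]) -> float:
    """Fraction of test units that also occur among the training units."""
    train = set(train_units)
    test = list(test_units)
    covered = sum(1 for unit in test if unit in train)
    return covered / len(test)


# Toy data standing in for a training task and a new, mismatched test task.
train = [("b", "a", "n"), ("m", "i", "t"), ("d", "u", "m")]
test = [("b", "a", "n"), ("p", "a", "m"), ("n", "i", "d"), ("t", "u", "a")]

exact = coverage(train, test)
categorized = coverage(map(to_category_unit, train),
                       map(to_category_unit, test))

print(f"exact-context coverage:       {exact:.2f}")
print(f"category-context coverage:    {categorized:.2f}")
```

In this toy setting, only one of the four test units has an exactly matching training unit, while three of the four are covered once contexts are compared at the category level. The numbers illustrate the mechanism only and say nothing about the coverage figures obtained in the paper.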

Bibliographic reference. Zhang, Jin-Song / Zhang, Shu-Wu / Sagisaka, Yoshinori / Nakamura, Satoshi (2001): "A hybrid approach to enhance task portability of acoustic models in Chinese speech recognition", in EUROSPEECH-2001, 1661-1664.