5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

A Large-Vocabulary Taiwanese (MIN-NAN) Multi-Syllabic Word Recognition System Based Upon Right-Context-Dependent Phones with State Clustering by Acoustic Decision Tree

Ren-yuan Lyu (1), Yuang-jin Chiang (2), Wen-ping Hsieh (2)

(1) Chang Gung University, Taiwan
(2) National Tsing Hua University, Taiwan

In this paper, we apply context-dependent phonetic modeling to the task of large-vocabulary (20,000-word) Taiwanese multi-syllabic word recognition. Considering the phonetic characteristics of Taiwanese, right-context-dependent (RCD) phones are used instead of general tri-phones. The RCD phones are further clustered at the sub-phone (state) level using a decision tree with a set of context-split questions designed specifically for Taiwanese speech according to acoustic/phonetic knowledge. For the speaker-dependent case, a word error rate of 7.18% is achieved. A real-time prototype system, implemented on a Pentium-II personal computer running MS-Windows 95/NT, is also presented to validate the proposed approaches.
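
As an illustration of what a right-context-dependent unit is, the following minimal sketch relabels a phone sequence so that each phone carries only its right neighbour. It is not the authors' implementation: the function name `to_rcd_units`, the `phone+right` label notation, the `sil` boundary symbol, and the sample phone symbols are all assumptions made for this example.

```python
from typing import List


def to_rcd_units(phones: List[str], boundary: str = "sil") -> List[str]:
    """Relabel each phone as 'phone+right', using only the right neighbour.

    The last phone takes the (assumed) boundary symbol as its right context.
    """
    units = []
    for i, phone in enumerate(phones):
        # Look one position ahead; fall back to the boundary at the end.
        right = phones[i + 1] if i + 1 < len(phones) else boundary
        units.append(f"{phone}+{right}")
    return units


# Hypothetical phone sequence, for illustration only.
example = ["t", "a", "i"]
rcd = to_rcd_units(example)
print(rcd)
```

Because each unit depends on one neighbour rather than two, the inventory of distinct units grows with the number of observed phone pairs rather than phone triples, which is one general motivation for choosing such units before any state-level clustering is applied.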

Bibliographic reference. Lyu, Ren-yuan / Chiang, Yuang-jin / Hsieh, Wen-ping (1998): "A large-vocabulary Taiwanese (MIN-NAN) multi-syllabic word recognition system based upon right-context-dependent phones with state clustering by acoustic decision tree", In ICSLP-1998, paper 0080.