4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
Whereas early systems took isolated syllables as input units and required tedious speaker enrollment, our research focuses on speaker-independent, word-based dictation. A purpose-designed 120-speaker database was built for training; acoustic models dependent on inter-syllable context, tone, and endpoints are applied with promising MFCC features; two-pass acoustic matching accelerates recognition by taking full advantage of the monosyllabic structure of Chinese speech; and complete word bigram and trigram models serve as the language processing module. With these efforts, the system reaches 90% character accuracy while running in almost real time on a Pentium PC without DSP support.
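To illustrate how word bigram and trigram statistics can be combined in a language processing module, here is a minimal Python sketch. The abstract does not say how the two models are combined or smoothed, so the linear interpolation, the fixed weights, the class name `InterpolatedLM`, and the romanized toy corpus are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


class InterpolatedLM:
    """Toy word language model mixing unigram, bigram and trigram estimates.

    Assumption: linear interpolation with fixed weights. The paper's actual
    combination and smoothing scheme is not described in the abstract.
    """

    def __init__(self, sentences, w_tri=0.6, w_bi=0.3, w_uni=0.1):
        self.w_tri, self.w_bi, self.w_uni = w_tri, w_bi, w_uni
        self.uni = Counter()        # count(w)
        self.bi = Counter()         # count(w1, w)
        self.tri = Counter()        # count(w2, w1, w)
        self.ctx1 = Counter()       # how often w1 occurs as a bigram history
        self.ctx2 = Counter()       # how often (w2, w1) occurs as a trigram history
        self.total = 0              # total number of predicted tokens
        for words in sentences:
            # Two start symbols so the first word has a full trigram history.
            padded = ["<s>", "<s>"] + list(words) + ["</s>"]
            for i in range(2, len(padded)):
                w2, w1, w = padded[i - 2], padded[i - 1], padded[i]
                self.uni[w] += 1
                self.total += 1
                self.bi[(w1, w)] += 1
                self.ctx1[w1] += 1
                self.tri[(w2, w1, w)] += 1
                self.ctx2[(w2, w1)] += 1

    def prob(self, w2, w1, w):
        """Return the interpolated estimate of P(w | w2, w1)."""
        p_uni = self.uni[w] / self.total if self.total else 0.0
        p_bi = self.bi[(w1, w)] / self.ctx1[w1] if self.ctx1[w1] else 0.0
        p_tri = (self.tri[(w2, w1, w)] / self.ctx2[(w2, w1)]
                 if self.ctx2[(w2, w1)] else 0.0)
        return self.w_uni * p_uni + self.w_bi * p_bi + self.w_tri * p_tri


# Hypothetical two-sentence corpus of segmented, romanized Chinese words.
corpus = [["wo", "shi", "xuesheng"], ["wo", "shi", "laoshi"]]
lm = InterpolatedLM(corpus)
p = lm.prob("wo", "shi", "xuesheng")
```

In a dictation system, a score of this kind would typically be combined with the acoustic score to rank competing word sequences; a realistic model over a 32k-word vocabulary would also need proper smoothing for unseen word combinations.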
Bibliographic reference. Xu, Bo / Ma, Bing / Zhang, Shuwu / Qu, Fei / Huang, Taiyi (1996): "Speaker-independent dictation of Chinese speech with 32k vocabulary", In ICSLP-1996, 2320-2323.