Speech Prosody 2006

Dresden, Germany
May 2-5, 2006

A New Approach of Using Temporal Information in Mandarin Speech Recognition

Jyh-Her Yang (1), Yuan-Fu Liao (2), Yih-Ru Wang (1), Sin-Horng Chen (1)

(1) Dept. of Communication Engineering, National Chiao Tung University, Taiwan
(2) Department of Electronic Engineering, National Taipei University of Technology, Taiwan

This paper presents a new approach to using temporal information to assist Mandarin speech recognition. It incorporates two types of temporal information into the recognition search. The first is a statistical syllable duration model that accounts for the influence of 411 base-syllables, 5 tones, 4 position-in-word factors, and 3 position-in-sentence factors on syllable duration. The second is timing information that models three types of inter-syllable boundary: intra-word, inter-word without a punctuation mark (PM), and inter-word with a PM. Both types of temporal information are expected to improve segmentation accuracy in both acoustic decoding and linguistic decoding. Experimental results showed slight improvements in base-syllable, character, and word recognition rates on both the MATBN and Treebank databases.
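
As an illustration only, a syllable duration model of the kind described above could be sketched as follows. The abstract does not state the functional form of the model, so the additive combination of factor offsets, the Gaussian scoring, and every name and numeric value here (`expected_duration`, `duration_log_score`, the offsets, the 0.20 s mean, the 0.05 s standard deviation) are assumptions made for this sketch, not the authors' method. Only the factor inventory sizes (411, 5, 4, 3) come from the abstract.

```python
import math

# Factor inventory sizes taken from the abstract.
N_BASE_SYLLABLES = 411
N_TONES = 5
N_WORD_POSITIONS = 4
N_SENTENCE_POSITIONS = 3


def expected_duration(global_mean, base_eff, tone_eff, word_pos_eff,
                      sent_pos_eff, syl, tone, word_pos, sent_pos):
    """Expected syllable duration in seconds.

    Assumed form: a global mean plus one additive offset per factor.
    """
    return (global_mean
            + base_eff[syl]
            + tone_eff[tone]
            + word_pos_eff[word_pos]
            + sent_pos_eff[sent_pos])


def duration_log_score(observed, expected, std):
    """Gaussian log-density of an observed duration around the expected one.

    A score like this could be added to a hypothesis score during search
    (an assumption; the abstract does not say how the model is applied).
    """
    z = (observed - expected) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


# Hypothetical offsets in seconds; illustrative values, not from the paper.
base_eff = [0.0] * N_BASE_SYLLABLES
tone_eff = [0.0] * N_TONES
word_pos_eff = [0.0] * N_WORD_POSITIONS
sent_pos_eff = [0.0] * N_SENTENCE_POSITIONS
tone_eff[4] = -0.04      # offset for the fifth tone index
word_pos_eff[3] = 0.02   # offset for the fourth position-in-word class
sent_pos_eff[2] = 0.03   # offset for the third position-in-sentence class

mu = expected_duration(0.20, base_eff, tone_eff, word_pos_eff, sent_pos_eff,
                       syl=0, tone=4, word_pos=3, sent_pos=2)

# A duration close to the expectation scores higher than a distant one.
near = duration_log_score(0.21, mu, std=0.05)
far = duration_log_score(0.40, mu, std=0.05)
```

Under these assumptions, a candidate segmentation whose syllable durations sit far from their expected values receives a lower score, which is one way duration information could discourage implausible segment boundaries.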


Bibliographic reference.  Yang, Jyh-Her / Liao, Yuan-Fu / Wang, Yih-Ru / Chen, Sin-Horng (2006): "A new approach of using temporal information in Mandarin speech recognition", In SP-2006, paper 213.