Speech Prosody 2006
This paper demonstrates how prosody information can be used to recognize fluent Mandarin Chinese speech, and what the recognized results imply. Our hierarchical prosody framework for fluent speech specifies boundary breaks and boundary information across phrases, and groups phrases into speech paragraphs. By applying it, we developed software that automatically segments the speech flow at boundary breaks and labels the boundaries systematically. That is, the recognized results are identified speech paragraphs and the various levels of prosodic units within each paragraph. These recognized prosodic units are not unrelated speech units but sister constituents that entail higher-level syntactic as well as semantic relationships, which cumulatively make up speech paragraphs in fluent continuous speech. This top-down approach differs from most bottom-up approaches: the former offers information from higher-level linguistic associations, whereas the latter treat identified Chinese syllables as discrete, unrelated units, or as lexical words at most, and leave unaddressed the structural information that combines these syllables into linguistically significant units. We believe that top-down prosody information may well break new ground in fluent speech recognition.
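The idea of segmenting a speech flow at boundary breaks and nesting the resulting units can be illustrated with a minimal sketch. It is not the software described above: the integer break levels (0 to 2), the function names `split_at` and `build`, and the placeholder syllables are all assumptions made for illustration, since the abstract does not specify the framework's actual break inventory or labeling scheme.

```python
def split_at(tokens, level):
    """Split tokens into chunks, closing a chunk after any break >= level.

    Each token is (syllable, break_level), where break_level is the
    hypothetical strength of the boundary that follows the syllable.
    """
    chunks, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok[1] >= level:
            chunks.append(current)
            current = []
    if current:  # keep a trailing chunk that has no closing break
        chunks.append(current)
    return chunks


def build(tokens, level):
    """Nest tokens top-down: split at `level`, then recurse at level - 1."""
    if level == 0:
        return [syllable for syllable, _ in tokens]
    return [build(chunk, level - 1) for chunk in split_at(tokens, level)]


# Placeholder syllables with assumed break levels (illustrative only).
tokens = [("s1", 0), ("s2", 1), ("s3", 0), ("s4", 2), ("s5", 0), ("s6", 1)]
tree = build(tokens, 2)
```

Here the strongest breaks are resolved first and weaker ones inside each resulting unit, so every syllable ends up as a constituent of a larger unit rather than as an isolated item, which is the contrast with bottom-up processing drawn above.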
Bibliographic reference. Tseng, Chiu-yu (2006): "Recognizing Mandarin Chinese fluent speech using prosody information: an initial investigation", In SP-2006, paper 008.