Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

A Multi-Pass Error Detection and Correction Framework for Mandarin LVCSR

Zhengyu Zhou, Helen M. Meng, Wai Kit Lo

Chinese University of Hong Kong, China

We previously proposed a multi-pass framework for Large Vocabulary Continuous Speech Recognition (LVCSR). The objective of this framework is to apply sophisticated linguistic models to recognition while maintaining a balance between complexity and efficiency. The framework is composed of three passes: initial recognition, error detection, and error correction. This paper presents and evaluates a prototype of the multi-pass framework for Mandarin dictation. In this prototype, the first pass recognizes speech with a well-trained, state-of-the-art recognizer that incorporates an efficient language model; the second pass detects recognition errors with a new three-step error detection procedure; and the third pass applies a novel error correction approach to the errors detected in lightly erroneous utterances. The error correction algorithm first creates a candidate list for each detected error and then re-ranks the candidates with a model that combines mutual information and a trigram language model. In Mandarin dictation experiments, the framework achieves a 4% relative reduction in character error rate (CER) over the initial recognition result on the utterances detected as lightly erroneous.
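
The candidate re-ranking step in the third pass can be illustrated with a minimal sketch. The abstract does not say how the mutual information and trigram scores are combined, so the linear interpolation below, the weight `lam`, the probability floor, the toy probability tables, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math

def pmi(w, c, unigram, cooc, floor=1e-9):
    """Pointwise mutual information between candidate w and context word c.
    Unseen pairs fall back to a small floor probability (an assumption)."""
    p_wc = cooc.get((w, c), floor)
    return math.log(p_wc / (unigram[w] * unigram[c]))

def trigram_logprob(seq, trigram, floor=1e-9):
    """Sum of log P(w_i | w_{i-2}, w_{i-1}) over the sequence."""
    padded = ["<s>", "<s>"] + list(seq)
    return sum(
        math.log(trigram.get(tuple(padded[i - 2:i + 1]), floor))
        for i in range(2, len(padded))
    )

def rerank(utterance, pos, candidates, unigram, cooc, trigram, lam=0.5):
    """Rank candidates for the detected error at index pos, best first.
    The score is a hypothetical linear interpolation of the two models."""
    scored = []
    for cand in candidates:
        hyp = utterance[:pos] + [cand] + utterance[pos + 1:]
        # Mutual information of the candidate with every other word in the utterance.
        mi = sum(pmi(cand, c, unigram, cooc) for i, c in enumerate(hyp) if i != pos)
        score = lam * mi + (1 - lam) * trigram_logprob(hyp, trigram)
        scored.append((score, cand))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [cand for _, cand in scored]

# Toy tables (invented values, romanized tokens stand in for Chinese characters).
unigram = {"wo": 0.1, "men": 0.05, "wen": 0.05, "qu": 0.1}
cooc = {("men", "wo"): 0.02, ("men", "qu"): 0.01}
trigram = {
    ("<s>", "<s>", "wo"): 0.5,
    ("<s>", "wo", "men"): 0.4,
    ("wo", "men", "qu"): 0.3,
}

ranking = rerank(["wo", "wen", "qu"], 1, ["men", "wen"], unigram, cooc, trigram)
print(ranking)  # ['men', 'wen']
```

In this toy case the recognized token "wen" is flagged at position 1, and the candidate "men" is ranked first because it scores higher under both the co-occurrence table and the trigram table.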

Bibliographic reference. Zhou, Zhengyu / Meng, Helen M. / Lo, Wai Kit (2006): "A multi-pass error detection and correction framework for Mandarin LVCSR", in INTERSPEECH-2006, paper 1947-Wed1CaP.12.