INTERSPEECH 2006 - ICSLP
Discriminative training, especially the Minimum Verification Error (MVE) method, plays an important role in detection-based ASR. Recently, discriminative training has also been shown to be effective in large vocabulary continuous speech recognition. In this paper, we propose a rescoring framework that demonstrates the improvement obtained by fusing MVE-trained detectors with a conventional recognizer. The recognizer performs regular Viterbi decoding, generating recognition candidates with their corresponding likelihoods in the form of either N-best lists or word graphs. Detectors trained under the MVE criterion form and conduct hypothesis tests for all test tokens to produce additional scores. A number of linear or non-linear rescoring methods are then presented to combine these two groups of scores. The experiments were conducted on the TIMIT database, and the results indicate that combination based on word graphs outperforms combination based on N-best lists in final accuracy. This rescoring framework explores possible ways to combine other independent knowledge sources with a conventional recognizer. Furthermore, it can guide future research on pure detection-based ASR techniques.
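As an illustration of the linear case only, the following minimal sketch rescores an N-best list by interpolating the recognizer log-likelihood with an averaged detector score. The abstract does not specify the actual combination formulas, so everything here is an assumption made for illustration: the `Hypothesis` structure, the `weight` parameter, the per-token averaging of detector scores, and the score scales are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One N-best candidate (hypothetical structure, not from the paper)."""
    tokens: List[str]
    log_likelihood: float          # score from the conventional recognizer
    detector_scores: List[float]   # assumed: one detector score per token


def combined_score(hyp: Hypothesis, weight: float) -> float:
    """Linear interpolation of recognizer and detector scores.

    weight = 0.0 uses only the recognizer; weight = 1.0 uses only the detectors.
    Averaging the detector scores is an assumption, chosen so that candidates
    with different token counts stay comparable.
    """
    if hyp.detector_scores:
        detector_mean = sum(hyp.detector_scores) / len(hyp.detector_scores)
    else:
        detector_mean = 0.0
    return (1.0 - weight) * hyp.log_likelihood + weight * detector_mean


def rescore_nbest(nbest: List[Hypothesis], weight: float) -> List[Hypothesis]:
    """Return the candidates sorted by combined score, best first."""
    return sorted(nbest, key=lambda h: combined_score(h, weight), reverse=True)


# Toy example with made-up scores: the detectors prefer the second candidate.
nbest = [
    Hypothesis(["sil", "ae", "t"], log_likelihood=-10.0, detector_scores=[-2.0, -2.0]),
    Hypothesis(["sil", "eh", "t"], log_likelihood=-11.0, detector_scores=[3.0, 3.0]),
]
best_recognizer_only = rescore_nbest(nbest, weight=0.0)[0]
best_fused = rescore_nbest(nbest, weight=0.5)[0]
```

In this toy example the recognizer alone ranks the first candidate highest, while the fused score with `weight=0.5` promotes the second. In practice the weight would have to be tuned on held-out data, and the two score types would likely need normalization before being combined.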
Bibliographic reference. Fu, Qiang / Juang, Biing-Hwang (2006): "Investigation on rescoring using minimum verification error (MVE) detectors", In INTERSPEECH-2006, paper 1761-Mon3CaP.11.