5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Word and Acoustic Confidence Annotation for Large Vocabulary Speech Recognition

Lin Chase

The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA

We present improvements in confidence annotation of automatic speech recognizer output for large vocabulary, speaker-independent systems. Several strong additions to the set of predictor variables used for this purpose are discussed. Extensions which allow prediction of separate types of errors, as opposed to the simple presence of an error, are presented. A new development, acoustic confidence annotation, is explored, in which a predictor is built that indicates the likely successes and failures of the acoustic models alone. Four separate learning mechanisms are compared in terms of their ability to provide good confidence annotations from the same set of predictor variables. Performance figures are reported on both read news (the North American Business news corpus) and conversational telephone speech (the Switchboard corpus), both in American English. The Sphinx-II system [1] is used for the NAB tests. The Janus system [2] is used for the Switchboard tests.
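As a rough illustration of what word-level confidence annotation involves, the sketch below trains a small logistic regression model that maps predictor variables for each hypothesized word to a probability that the word is correct. Everything in it is an assumption made for illustration: the abstract does not name the four learning mechanisms or the predictor variables, so logistic regression, the two feature columns (a normalized acoustic score and a language model score), the function names, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(x, weights, bias):
    """Probability that a word with predictor variables x is correct."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_confidence_model(features, labels, epochs=500, lr=0.5):
    """Fit logistic regression by batch gradient descent.

    features: one list of predictor variables per hypothesized word
    labels:   1 if the recognizer got the word right, else 0
    """
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = score(x, weights, bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def annotate(words, features, weights, bias):
    """Attach a confidence value in [0, 1] to each hypothesized word."""
    return [(w, score(x, weights, bias)) for w, x in zip(words, features)]


# Hypothetical training data: [normalized acoustic score, LM score].
train_x = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6],
           [0.2, 0.3], [0.1, 0.2], [0.3, 0.1]]
train_y = [1, 1, 1, 0, 0, 0]

weights, bias = train_confidence_model(train_x, train_y)
annotated = annotate(["stocks", "fell"],
                     [[0.85, 0.9], [0.15, 0.2]],
                     weights, bias)
for word, confidence in annotated:
    print(f"{word}: {confidence:.2f}")
```

Predicting separate types of errors, as the abstract describes, would replace the single correct/incorrect label with several classes. That extension is not shown here.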

Bibliographic reference.  Chase, Lin (1997): "Word and acoustic confidence annotation for large vocabulary speech recognition", In EUROSPEECH-1997, 815-818.