12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Combining Information Sources for Confidence Estimation with CRF Models

M. S. Seigel, P. C. Woodland

University of Cambridge, UK

Obtaining accurate confidence measures for automatic speech recognition (ASR) transcriptions is an important task which stands to benefit from the use of multiple information sources. This paper investigates the application of conditional random field (CRF) models as a principled technique for combining multiple features from such sources. A novel method for combining suitably defined features is presented, allowing for confidence annotation using lattice-based features of hypotheses other than the lattice 1-best. The resulting framework is applied to different stages of a state-of-the-art large vocabulary speech recognition pipeline, and consistent improvements are shown over a sophisticated baseline system.

Bibliographic reference. Seigel, M. S. / Woodland, P. C. (2011): "Combining information sources for confidence estimation with CRF models", in INTERSPEECH-2011, 905-908.