12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Improving LVCSR System Combination Using Neural Network Language Model Cross Adaptation

X. Liu, M. J. F. Gales, P. C. Woodland

University of Cambridge, UK

State-of-the-art large vocabulary continuous speech recognition (LVCSR) systems often combine outputs from multiple sub-systems developed at different sites. Cross system adaptation can be used as an alternative to direct hypothesis level combination schemes such as ROVER. The standard approach cross adapts only the acoustic models. To fully exploit the complementary features among sub-systems, language model (LM) cross adaptation techniques can also be used. This paper extends previous research on multi-level n-gram LM cross adaptation to include the cross adaptation of neural network LMs. Using this improved LM cross adaptation framework, significant error rate reductions of 4.0%-7.1% relative were obtained over acoustic model only cross adaptation when combining a range of Chinese LVCSR sub-systems used in the 2010 and 2011 DARPA GALE evaluations.
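To make the "relative" figures concrete, the following minimal sketch shows how a relative error rate reduction is computed from two absolute error rates. The function name and the example error rates are hypothetical, chosen only for illustration; they are not results from the paper.

```python
def relative_reduction(baseline_err: float, improved_err: float) -> float:
    """Return the relative error rate reduction, in percent.

    Both arguments are absolute error rates expressed in the same unit
    (for example, percent character error rate).
    """
    if baseline_err <= 0:
        raise ValueError("baseline error rate must be positive")
    return 100.0 * (baseline_err - improved_err) / baseline_err


# Hypothetical error rates (percent), for illustration only.
baseline = 10.0  # assumed: acoustic model only cross adaptation
improved = 9.5   # assumed: with LM cross adaptation added
print(f"{relative_reduction(baseline, improved):.1f}% relative")
```

Under these assumed numbers, a 0.5 point absolute reduction from a 10.0% baseline corresponds to a 5.0% relative reduction.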


Bibliographic reference.  Liu, X. / Gales, M. J. F. / Woodland, P. C. (2011): "Improving LVCSR system combination using neural network language model cross adaptation", In INTERSPEECH-2011, 2857-2860.