EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Broadcast News LM Adaptation using Contemporary Texts

Marcello Federico, Nicola Bertoldi

ITC-Irst, Italy

This paper investigates the problem of dynamically updating the language model (LM) of a broadcast news speech recognition system to cope with the language and topic changes typical of the news domain. Statistical adaptation methods are proposed that exploit written news sources available daily on the Internet, namely newswires and newspapers. Specifically, LM adaptation is performed by extending the basic lexicon, in order to minimize the out-of-vocabulary (OOV) rate, and by adapting the word probability distribution to the contemporary data. Experiments on 19 newscasts showed relative reductions of 58% in the OOV rate, 16% in perplexity, and 4% in the word error rate.
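
As an illustration of the two adaptation steps named in the abstract, the following minimal Python sketch computes an OOV rate, extends a lexicon with the most frequent unseen words from contemporary text, and linearly interpolates a background unigram distribution with relative frequencies from that text. The abstract does not specify the selection criterion or the adaptation formula used in the paper, so the function names, the frequency-based word selection, the unigram simplification, and the interpolation weight `lam` are all assumptions made for this example.

```python
from collections import Counter


def oov_rate(tokens, lexicon):
    """Fraction of running words in `tokens` that are not in `lexicon`."""
    if not tokens:
        return 0.0
    return sum(1 for w in tokens if w not in lexicon) / len(tokens)


def extend_lexicon(lexicon, contemporary_tokens, max_new_words):
    """Return a new lexicon with the most frequent unseen words added.

    Assumption: new words are ranked by raw frequency in the
    contemporary text; the paper may use a different criterion.
    """
    counts = Counter(w for w in contemporary_tokens if w not in lexicon)
    new_words = [w for w, _ in counts.most_common(max_new_words)]
    return set(lexicon) | set(new_words)


def interpolate_unigrams(background, contemporary_tokens, lam):
    """Mix a background unigram distribution with contemporary frequencies.

    Assumption: simple linear interpolation with weight `lam` on the
    contemporary data, shown on unigrams only for brevity.
    """
    counts = Counter(contemporary_tokens)
    total = sum(counts.values())
    vocab = set(background) | set(counts)
    return {
        w: (1.0 - lam) * background.get(w, 0.0)
        + lam * (counts[w] / total if total else 0.0)
        for w in vocab
    }


if __name__ == "__main__":
    # Toy data, invented for the example.
    lexicon = {"the", "president", "said", "today"}
    news = "the euro summit said the euro".split()
    transcript = "the president said the euro summit".split()

    print("OOV before:", oov_rate(transcript, lexicon))
    adapted_lexicon = extend_lexicon(lexicon, news, max_new_words=1)
    print("OOV after: ", oov_rate(transcript, adapted_lexicon))

    background = {"the": 0.4, "president": 0.2, "said": 0.2, "today": 0.2}
    adapted = interpolate_unigrams(background, news, lam=0.5)
    print("P(euro) after adaptation:", adapted["euro"])
```

In this toy setting, adding a single new word from the contemporary text halves the OOV rate of the transcript, and the interpolated distribution assigns non-zero probability to a word the background model had never seen. A real system would apply the same idea to n-gram models and tune the number of added words and the interpolation weight on held-out data.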


Bibliographic reference.  Federico, Marcello / Bertoldi, Nicola (2001): "Broadcast news LM adaptation using contemporary texts", In EUROSPEECH-2001, 239-242.