5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Performance Evaluation of Word Phrase and Noun Category Language Models For Broadcast News Speech Recognition

Kazuyuki Takagi, Rei Oguro, Kenji Hashimoto, Kazuhiko Ozeki

The University of Electro-Communications, Japan

This paper reports our work on improving a bigram language model for Japanese TV broadcast news speech recognition. First, frequent word strings were grouped into phrases, and the phrases were added to the lexicon as new recognition units. Test-set perplexity improved when frequent function word strings were used as additional recognition units. Speech recognition performance improved both when function word strings were grouped and when compound nouns selected by word association ratio were grouped. Second, to alleviate the out-of-vocabulary (OOV) problem for nouns, we built and tested a language model that can switch its noun lexicon according to the domain of the article to be recognized next.
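The phrase-grouping step can be illustrated with a minimal sketch. The abstract does not define the word association ratio, so the code below assumes a pointwise-mutual-information style score, log2(P(w1,w2) / (P(w1)P(w2))), computed over adjacent word pairs. The function names, the thresholds `min_count` and `min_score`, and the toy token sequence are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def association_ratio(tokens):
    """Score each adjacent word pair with log2(P(w1,w2) / (P(w1) * P(w2))).

    Assumption: a PMI-style score stands in for the paper's
    "word association ratio", which the abstract does not define.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p1 = unigrams[w1] / n_uni
        p2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p1 * p2))
    return scores


def select_phrases(tokens, min_count=2, min_score=1.0):
    """Keep pairs that are both frequent and strongly associated.

    Both thresholds are illustrative values, not the paper's settings.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = association_ratio(tokens)
    return {
        pair
        for pair, count in bigrams.items()
        if count >= min_count and scores[pair] >= min_score
    }


def merge_phrases(tokens, phrases, joiner="_"):
    """Rewrite the token stream so each selected pair becomes one unit."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in phrases:
            merged.append(tokens[i] + joiner + tokens[i + 1])
            i += 2  # skip the second word of the merged pair
        else:
            merged.append(tokens[i])
            i += 1
    return merged


if __name__ == "__main__":
    toy = "a b c a b d a b e f".split()  # toy data, not from the paper
    units = select_phrases(toy)
    print(units)
    print(merge_phrases(toy, units))
```

Under these assumptions, the merged token stream would then be used to re-estimate the bigram model, so that each selected phrase is treated as a single lexicon entry.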

Bibliographic reference. Takagi, Kazuyuki / Oguro, Rei / Hashimoto, Kenji / Ozeki, Kazuhiko (1998): "Performance evaluation of word phrase and noun category language models for broadcast news speech recognition", in ICSLP-1998, paper 0026.