Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

An Approach to Language Identification with Enhanced Language Model

Yonghong Yan, Etienne Barnard

Center for Spoken Language Understanding, Oregon Graduate Institute of Science and Technology, Portland, OR, USA

An approach to Language Identification (LID) based on language-dependent phone recognition is presented. The LID system is designed to exploit the differing phonotactic constraints of different languages. Various LID features are extracted from the output of language-dependent phone recognizers. Two methods are proposed to improve language modeling accuracy: (1) language models based on forward and backward bigrams, and (2) language model optimization based on back-propagation. The system was evaluated on a standard 11-language task and a standard nine-language task. The correct identification rate reached 87.6% for 45-second utterances and 73.6% for 10-second utterances on the 11-language task, and 87.8% and 74.0%, respectively, on the nine-language task. With channel normalization added, the performance of our best systems improved further to 90.8% and 77.1% on the 11-language task, and to 91.1% and 77.5% on the nine-language task.
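As an illustration of the forward and backward bigram idea named above, the following minimal sketch scores a phone sequence with two bigram models per language: one conditioning each phone on its predecessor (forward) and one conditioning it on its successor (backward). It is not the system described in this paper. The function names, the add-one smoothing, the plain sum of the two log-likelihoods, and the toy phone strings are all assumptions made for the example.

```python
import math
from collections import Counter


def train_bigram(sequences, reverse=False):
    """Count bigrams over phone sequences.

    With reverse=True each sequence is reversed first, so the model
    conditions a phone on the phone that FOLLOWS it (backward bigram).
    """
    pair_counts = Counter()
    context_counts = Counter()
    vocab = set()
    for seq in sequences:
        if reverse:
            seq = seq[::-1]
        vocab.update(seq)
        for context, phone in zip(seq, seq[1:]):
            pair_counts[(context, phone)] += 1
            context_counts[context] += 1
    return pair_counts, context_counts, vocab


def log_prob(model, seq, reverse=False):
    """Add-one smoothed bigram log-likelihood (smoothing is an assumption)."""
    pair_counts, context_counts, vocab = model
    if reverse:
        seq = seq[::-1]
    v = len(vocab) + 1  # one extra slot reserved for unseen phones
    total = 0.0
    for context, phone in zip(seq, seq[1:]):
        numerator = pair_counts[(context, phone)] + 1
        denominator = context_counts[context] + v
        total += math.log(numerator / denominator)
    return total


def identify(models, seq):
    """Pick the language whose forward + backward score is highest.

    Summing the two log-likelihoods is an assumed combination rule.
    """
    scores = {
        lang: log_prob(fwd, seq) + log_prob(bwd, seq, reverse=True)
        for lang, (fwd, bwd) in models.items()
    }
    return max(scores, key=scores.get), scores


# Toy "phone" transcriptions for two hypothetical languages A and B.
training = {
    "A": [list("ababa"), list("baba")],
    "B": [list("aabbaa"), list("bbaabb")],
}
models = {
    lang: (train_bigram(seqs), train_bigram(seqs, reverse=True))
    for lang, seqs in training.items()
}

best, scores = identify(models, list("abab"))
print(best, scores)
```

In this toy setup, alternating sequences favor language A and sequences with doubled phones favor language B, because each language's bigram counts reward the transitions seen in its own training data.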

