ISCA Archive Interspeech 2013

Written-domain language modeling for automatic speech recognition

Haşim Sak, Yun-hsuan Sung, Françoise Beaufays, Cyril Allauzen

Language modeling for automatic speech recognition (ASR) systems has traditionally been done in the verbal domain. In this paper, we present finite-state modeling techniques that we developed for language modeling in the written domain. The first technique we describe is the verbalization of written-domain vocabulary items, which include both lexical and non-lexical entities. The second is a decomposition-recomposition approach that addresses the out-of-vocabulary (OOV) and data-sparsity problems for non-lexical entities such as URLs, email addresses, phone numbers, and dollar amounts. We evaluate the proposed written-domain language modeling approaches on a very large vocabulary speech recognition system for English. We show that written-domain language modeling improves both speech recognition accuracy and the rendering of ASR transcripts in the written domain over a baseline system using a verbal-domain language model. In addition, the written-domain system is much simpler, since it does not require the complex and error-prone text normalization and denormalization rules generally needed for verbal-domain language modeling.
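To illustrate the verbalization problem the abstract describes, here is a minimal, hypothetical Python sketch that maps a few written-domain tokens to verbal-domain word sequences. It is not the authors' method: the paper implements verbalization with finite-state transducers, whereas this toy uses a dictionary and regular expressions purely to show the written-to-verbal mapping; all token patterns below are illustrative assumptions.

```python
import re

# Toy verbalizer (illustrative only; the paper uses finite-state
# transducers). Maps written-domain tokens such as "$5" or "411" to
# verbal-domain word sequences, and leaves lexical words unchanged.
DIGITS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
          "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}

def verbalize(token):
    if re.fullmatch(r"\$\d+", token):       # dollar amount, e.g. "$5"
        return " ".join(DIGITS[d] for d in token[1:]) + " dollars"
    if re.fullmatch(r"\d+", token):         # digit string, e.g. "411"
        return " ".join(DIGITS[d] for d in token)
    return token.lower()                    # lexical word: identity

print(verbalize("$5"))   # five dollars
print(verbalize("411"))  # four one one
```

In the paper's setting, a written-domain token may have several verbal expansions (e.g. "411" as "four one one" or "four eleven"), which is why a transducer that encodes all alternatives, rather than a single-valued function like this one, is used.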

doi: 10.21437/Interspeech.2013-192

Cite as: Sak, H., Sung, Y.-h., Beaufays, F., Allauzen, C. (2013) Written-domain language modeling for automatic speech recognition. Proc. Interspeech 2013, 675-679, doi: 10.21437/Interspeech.2013-192

@inproceedings{sak13_interspeech,
  author={Haşim Sak and Yun-hsuan Sung and Françoise Beaufays and Cyril Allauzen},
  title={{Written-domain language modeling for automatic speech recognition}},
  booktitle={Proc. Interspeech 2013},
  year={2013},
  pages={675--679},
  doi={10.21437/Interspeech.2013-192}
}