Language Model Data Augmentation Based on Text Domain Transfer

Atsunori Ogawa, Naohiro Tawara, Marc Delcroix

To improve the performance of automatic speech recognition (ASR) for a specific domain, it is essential to train a language model (LM) using text data of the target domain. In this study, we propose a method that transfers a large amount of source-domain text to the target domain, augmenting the data available for training a target-domain-specific LM. The proposed method consists of two steps, which use a bidirectional long short-term memory (BLSTM)-based word-replacing model and a target-domain-adapted LSTMLM, respectively. Using the domain-specific wordings it has learned, the word-replacing model converts a given source-domain sentence to a confusion network (CN) that includes a variety of target-domain candidate word sequences. Then, the LSTMLM selects a target-domain sentence from the CN by evaluating its grammatical correctness based on decoding scores. In experiments using lecture and conversational speech corpora as the source and target domain data sets, we confirmed that the proposed LM data augmentation method improves the target conversational speech recognition performance of a hybrid ASR system using an n-gram LM and the performance of N-best rescoring using an LSTMLM.
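The second step, selecting one sentence from a confusion network with a language model, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy confusion network, the bigram table standing in for the target-domain LSTMLM, and the names `confusion_network`, `lm_score`, `decode`, `lm_weight`, and `beam` are all assumptions made for illustration, as is the choice to add the replacement probabilities to the LM score.

```python
import math

# Hypothetical toy confusion network: one slot per source word, each slot
# holding (candidate word, replacement probability) pairs.
confusion_network = [
    [("we", 0.6), ("i", 0.4)],
    [("propose", 0.7), ("think", 0.3)],
    [("that", 0.5), ("so", 0.5)],
]

# Hypothetical bigram log-probabilities standing in for the
# target-domain LSTMLM (an assumption, not the paper's model).
bigram_logprob = {
    ("<s>", "i"): math.log(0.5),
    ("<s>", "we"): math.log(0.2),
    ("i", "think"): math.log(0.6),
    ("we", "propose"): math.log(0.1),
    ("think", "so"): math.log(0.5),
    ("think", "that"): math.log(0.3),
    ("propose", "that"): math.log(0.4),
}
UNSEEN = math.log(1e-3)  # floor for bigrams missing from the table


def lm_score(prev, word):
    """Log-probability of `word` following `prev` under the toy LM."""
    return bigram_logprob.get((prev, word), UNSEEN)


def decode(cn, lm_weight=1.0, beam=4):
    """Beam search over the slots of a confusion network.

    Each hypothesis score is the sum of log replacement probabilities
    plus `lm_weight` times the LM log-probabilities.
    """
    beams = [(0.0, ["<s>"])]
    for slot in cn:
        expanded = []
        for score, words in beams:
            for word, prob in slot:
                new_score = (score + math.log(prob)
                             + lm_weight * lm_score(words[-1], word))
                expanded.append((new_score, words + [word]))
        # Keep only the `beam` highest-scoring partial hypotheses.
        expanded.sort(key=lambda item: item[0], reverse=True)
        beams = expanded[:beam]
    best_score, best_words = beams[0]
    return best_words[1:], best_score  # drop the <s> marker


if __name__ == "__main__":
    sentence, score = decode(confusion_network)
    print(" ".join(sentence), round(score, 3))
```

In this toy setup the LM term steers the search toward the conversational wording even though the per-slot replacement probabilities favor the lecture-style words; setting `lm_weight=0.0` leaves only the slot probabilities in play.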

DOI: 10.21437/Interspeech.2020-1524

Cite as: Ogawa, A., Tawara, N., Delcroix, M. (2020) Language Model Data Augmentation Based on Text Domain Transfer. Proc. Interspeech 2020, 4926-4930, DOI: 10.21437/Interspeech.2020-1524.

  author={Atsunori Ogawa and Naohiro Tawara and Marc Delcroix},
  title={{Language Model Data Augmentation Based on Text Domain Transfer}},
  booktitle={Proc. Interspeech 2020},