Transfer Learning and Distillation Techniques to Improve the Acoustic Modeling of Low Resource Languages

Basil Abraham, Tejaswi Seeram, S. Umesh


Deep neural networks (DNNs) require large amounts of training data to build robust acoustic models for speech recognition tasks. Our work aims to improve low-resource acoustic models to a performance comparable to that of a high-resource scenario, with the help of data and model parameters from other high-resource languages. We explore transfer learning and distillation methods, in which a complex high-resource model guides or supervises the training of the low-resource model. The techniques include (i) a multilingual framework that borrows data from a high-resource language while training the low-resource acoustic model, with KL-divergence-based constraints added to bias the model toward the low-resource language, and (ii) distilling knowledge from the complex high-resource model to improve the low-resource acoustic model. Experiments were performed on three Indian languages, namely Hindi, Tamil, and Kannada. All the techniques gave improved performance, with the multilingual framework with KL-divergence regularization giving the best results. In all three languages, performance close to or better than the high-resource scenario was obtained.
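
For illustration, here is a minimal sketch in PyTorch of the kind of training objective the abstract describes: a cross-entropy term on the low-resource (student) model's own labels combined with a KL-divergence term that pulls its posteriors toward those of a high-resource (teacher) model, in the style of Hinton et al.'s knowledge distillation. This is not the paper's implementation; the function name, the temperature, and the interpolation weight alpha are illustrative assumptions.

import torch
import torch.nn.functional as F

def distillation_kl_loss(student_logits, teacher_logits, hard_targets,
                         temperature=2.0, alpha=0.5):
    # Hypothetical helper, not taken from the paper: combines hard-target
    # cross-entropy with a KL term toward the teacher's soft posteriors.
    ce = F.cross_entropy(student_logits, hard_targets)

    # KL(teacher || student) on temperature-softened posteriors;
    # F.kl_div expects log-probabilities for the first argument.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean")

    # The temperature**2 factor keeps the gradient scales of the two
    # terms comparable, following the standard distillation recipe.
    return alpha * ce + (1.0 - alpha) * (temperature ** 2) * kl

# Example with random stand-ins for per-frame senone logits.
student_logits = torch.randn(32, 3000, requires_grad=True)
teacher_logits = torch.randn(32, 3000)
targets = torch.randint(0, 3000, (32,))
loss = distillation_kl_loss(student_logits, teacher_logits, targets)
loss.backward()

Interpolating the two terms with a weight such as alpha lets the same objective express both regimes the abstract mentions: a large KL weight makes the high-resource model dominate (distillation), while a small one keeps the model biased toward the low-resource language's own targets.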


DOI: 10.21437/Interspeech.2017-1009

Cite as: Abraham, B., Seeram, T., Umesh, S. (2017) Transfer Learning and Distillation Techniques to Improve the Acoustic Modeling of Low Resource Languages. Proc. Interspeech 2017, 2158-2162, DOI: 10.21437/Interspeech.2017-1009.


@inproceedings{Abraham2017,
  author={Basil Abraham and Tejaswi Seeram and S. Umesh},
  title={Transfer Learning and Distillation Techniques to Improve the Acoustic Modeling of Low Resource Languages},
  year={2017},
  booktitle={Proc. Interspeech 2017},
  pages={2158--2162},
  doi={10.21437/Interspeech.2017-1009},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1009}
}