Iterative Compression of End-to-End ASR Model Using AutoML

Abhinav Mehrotra, Łukasz Dudziak, Jinsu Yeo, Young-yoon Lee, Ravichander Vipperla, Mohamed S. Abdelfattah, Sourav Bhattacharya, Samin Ishtiaq, Alberto Gil C.P. Ramos, SangJeong Lee, Daehyun Kim, Nicholas D. Lane

Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interest in developing automatic model compression techniques. Past research has shown that an AutoML-based Low Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7×, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5× compression without degrading the WER, thereby advancing the state of the art in ASR compression.
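
For readers unfamiliar with LRF, the following is a minimal sketch of the basic operation the abstract refers to: replacing one weight matrix with the product of two thinner matrices obtained from a truncated SVD. It is not the authors' implementation and does not include the AutoML rank search or the iterative scheme. The function names, the 512×512 matrix, and the rank of 48 are illustrative assumptions, not values from the paper.

```python
import numpy as np


def low_rank_factorize(weight, rank):
    """Approximate `weight` (m x n) by left (m x rank) @ right (rank x n).

    Hypothetical helper for illustration; uses a plain truncated SVD.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    left = u[:, :rank] * s[:rank]  # fold the singular values into the left factor
    right = vt[:rank, :]
    return left, right


def compression_ratio(shape, rank):
    """Parameter count of the original matrix divided by that of the two factors."""
    m, n = shape
    return (m * n) / (rank * (m + n))


# Assumed example: a random 512 x 512 layer factorized at rank 48.
rng = np.random.default_rng(0)
weight = rng.standard_normal((512, 512))
left, right = low_rank_factorize(weight, rank=48)

ratio = compression_ratio(weight.shape, rank=48)
rel_error = np.linalg.norm(weight - left @ right) / np.linalg.norm(weight)
print(f"parameter compression: {ratio:.2f}x, relative error: {rel_error:.3f}")
```

A lower rank gives a higher compression ratio but a larger approximation error, which is the trade-off a rank-selection procedure has to navigate for every layer. The ratio here counts parameters in a single matrix only; it is not the same quantity as the model-level speedup or compression figures quoted in the abstract.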

DOI: 10.21437/Interspeech.2020-1894

Cite as: Mehrotra, A., Dudziak, Ł., Yeo, J., Lee, Y., Vipperla, R., Abdelfattah, M.S., Bhattacharya, S., Ishtiaq, S., Ramos, A.G.C., Lee, S., Kim, D., Lane, N.D. (2020) Iterative Compression of End-to-End ASR Model Using AutoML. Proc. Interspeech 2020, 3361-3365, DOI: 10.21437/Interspeech.2020-1894.

  author={Abhinav Mehrotra and Łukasz Dudziak and Jinsu Yeo and Young-yoon Lee and Ravichander Vipperla and Mohamed S. Abdelfattah and Sourav Bhattacharya and Samin Ishtiaq and Alberto Gil C.P. Ramos and SangJeong Lee and Daehyun Kim and Nicholas D. Lane},
  title={{Iterative Compression of End-to-End ASR Model Using AutoML}},
  booktitle={Proc. Interspeech 2020},