Re-Weighted Interval Loss for Handling Data Imbalance Problem of End-to-End Keyword Spotting

Kun Zhang, Zhiyong Wu, Daode Yuan, Jian Luan, Jia Jia, Helen Meng, Binheng Song


The training process of end-to-end keyword spotting (KWS) suffers from a critical data imbalance problem: positive samples are far fewer than negative samples, and different negative samples are not equally important. During decoding, false alarms are mainly caused by a small number of important negative samples whose pronunciation is similar to the keyword; however, the training loss is dominated by the majority of negative samples whose pronunciation is unrelated to the keyword, called unimportant negative samples. This inconsistency greatly degrades the performance of KWS, and existing methods such as focal loss do not discriminate between the two kinds of negative samples. To deal with this problem, we propose a novel re-weighted interval loss that re-weights each sample's loss according to the performance of the classifier over a local interval of the negative utterance. It automatically down-weights the losses of unimportant negative samples and focuses training on the important negative samples that are prone to produce false alarms during decoding. Evaluations on the Hey Snips dataset demonstrate that our approach outperforms the focal loss baseline, with a 34% relative reduction in false reject rate at 0.5 false alarms per hour.


DOI: 10.21437/Interspeech.2020-1644

Cite as: Zhang, K., Wu, Z., Yuan, D., Luan, J., Jia, J., Meng, H., Song, B. (2020) Re-Weighted Interval Loss for Handling Data Imbalance Problem of End-to-End Keyword Spotting. Proc. Interspeech 2020, 2567-2571, DOI: 10.21437/Interspeech.2020-1644.


@inproceedings{Zhang2020,
  author={Kun Zhang and Zhiyong Wu and Daode Yuan and Jian Luan and Jia Jia and Helen Meng and Binheng Song},
  title={{Re-Weighted Interval Loss for Handling Data Imbalance Problem of End-to-End Keyword Spotting}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2567--2571},
  doi={10.21437/Interspeech.2020-1644},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1644}
}