An Investigation of Few-Shot Learning in Spoken Term Classification

Yangbin Chen, Tom Ko, Lifeng Shang, Xiao Chen, Xin Jiang, Qing Li


In this paper, we investigate the feasibility of applying few-shot learning algorithms to a speech task. We formulate a user-defined scenario of spoken term classification as a few-shot learning problem. Most few-shot learning studies assume that all N classes in an N-way problem are new. We suggest that this assumption can be relaxed and define an (N+M)-way problem, where N and M are the numbers of new classes and fixed classes, respectively. We propose a modification to the Model-Agnostic Meta-Learning (MAML) algorithm to solve this problem. Experiments on the Google Speech Commands dataset show that our approach outperforms both the conventional supervised learning approach and the original MAML.
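To make the (N+M)-way formulation concrete, the following is a minimal sketch of how a single training episode might be assembled. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `sample_episode`, the label layout (new classes first, fixed classes last), and the placeholder class names are all hypothetical.

```python
import random


def sample_episode(new_pool, fixed_classes, n_new, k_shot, rng):
    """Build one (N+M)-way, K-shot support set (hypothetical helper).

    new_pool:      dict mapping class name -> list of examples (candidate new classes)
    fixed_classes: dict mapping class name -> list of examples (the M fixed classes)
    """
    # Draw N new classes at random; they change from episode to episode.
    new_names = rng.sample(sorted(new_pool), n_new)
    support = []
    # Assumption: new classes take labels 0..N-1, reassigned in every episode.
    for label, name in enumerate(new_names):
        for example in rng.sample(new_pool[name], k_shot):
            support.append((example, label))
    # Assumption: fixed classes keep the same labels N..N+M-1 in every episode.
    for offset, name in enumerate(sorted(fixed_classes)):
        for example in rng.sample(fixed_classes[name], k_shot):
            support.append((example, n_new + offset))
    return new_names, support


# Toy data with placeholder class names; integers stand in for audio features.
rng = random.Random(0)
new_pool = {f"new_{i}": list(range(10)) for i in range(6)}
fixed_classes = {"fixed_a": list(range(10)), "fixed_b": list(range(10))}
names, support = sample_episode(new_pool, fixed_classes, n_new=3, k_shot=2, rng=rng)
print(len(support))  # (N + M) * K examples in the support set
```

With N = 3, M = 2, and K = 2, the support set holds (3 + 2) × 2 = 10 labelled examples. Only the N new classes vary between episodes, which is the relaxation of the usual "all classes are new" assumption described in the abstract.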


DOI: 10.21437/Interspeech.2020-2568

Cite as: Chen, Y., Ko, T., Shang, L., Chen, X., Jiang, X., Li, Q. (2020) An Investigation of Few-Shot Learning in Spoken Term Classification. Proc. Interspeech 2020, 2582-2586, DOI: 10.21437/Interspeech.2020-2568.


@inproceedings{Chen2020,
  author={Yangbin Chen and Tom Ko and Lifeng Shang and Xiao Chen and Xin Jiang and Qing Li},
  title={{An Investigation of Few-Shot Learning in Spoken Term Classification}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2582--2586},
  doi={10.21437/Interspeech.2020-2568},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2568}
}