Off-Topic Spoken Response Detection with Word Embeddings

Su-Youn Yoon, Chong Min Lee, Ikkyu Choi, Xinhao Wang, Matthew Mulholland, Keelan Evanini

In this study, we developed an automated off-topic response detection system as a supplementary module for an automated proficiency scoring system for non-native English speakers’ spontaneous speech. Given a spoken response, the system first generates an automated transcription using an ASR system trained on non-native speech, and then generates a set of features to assess the response's similarity to the question. In contrast to previous studies, which required a large set of training responses for each question, the proposed system requires only the question text. This increases the practical impact of the system, since new questions can be added to a test dynamically. However, questions are typically short, and the traditional approach based on exact word matching does not perform well on them. To address this issue, we used a set of features based on neural embeddings and a convolutional neural network (CNN). A system based on the combination of all features achieved an accuracy of 87% on a balanced dataset, which was substantially higher than the accuracy of a baseline system using question-based vector space models (49%). Additionally, this system came close to the accuracy of a vector space model-based system that used a large set of responses to the test questions (93%).
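As a rough illustration of why embedding-based features can help when the question is short, the sketch below compares a question and a response by the cosine similarity of their averaged word vectors. It is not the feature set or the CNN described in the paper, whose details the abstract does not give. The three-dimensional vectors in `TOY_EMBEDDINGS`, the function names, and the example sentences are all hypothetical; a real system would load pretrained embeddings and operate on ASR output.

```python
import math

# Hypothetical toy embeddings for illustration only; a real system would
# load pretrained word vectors instead of these hand-made 3-d values.
TOY_EMBEDDINGS = {
    "city": [0.9, 0.1, 0.0],
    "town": [0.8, 0.2, 0.1],
    "live": [0.7, 0.3, 0.0],
    "music": [0.0, 0.1, 0.9],
    "guitar": [0.1, 0.0, 0.8],
}


def average_vector(tokens, embeddings):
    """Average the vectors of in-vocabulary tokens; None if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def question_similarity(question, response, embeddings=TOY_EMBEDDINGS):
    """Similarity of a response to the question text alone (no training responses)."""
    q = average_vector(question.lower().split(), embeddings)
    r = average_vector(response.lower().split(), embeddings)
    if q is None or r is None:
        return 0.0  # assumption: treat fully out-of-vocabulary text as dissimilar
    return cosine(q, r)


question = "describe the city where you live"
on_topic = "my town is small"       # shares no in-vocabulary word with the question
off_topic = "i play guitar music"

on_topic_score = question_similarity(question, on_topic)
off_topic_score = question_similarity(question, off_topic)
```

With these toy vectors, the on-topic response scores higher than the off-topic one even though it shares no content word with the question, which is the case where exact word matching gives no signal. A score like this would be one input feature to a classifier, not a decision by itself.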

DOI: 10.21437/Interspeech.2017-388

Cite as: Yoon, S., Lee, C.M., Choi, I., Wang, X., Mulholland, M., Evanini, K. (2017) Off-Topic Spoken Response Detection with Word Embeddings. Proc. Interspeech 2017, 2754-2758, DOI: 10.21437/Interspeech.2017-388.

  author={Su-Youn Yoon and Chong Min Lee and Ikkyu Choi and Xinhao Wang and Matthew Mulholland and Keelan Evanini},
  title={Off-Topic Spoken Response Detection with Word Embeddings},
  booktitle={Proc. Interspeech 2017},