13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Maximum F1-Score Discriminative Training for Automatic Mispronunciation Detection in Computer-Assisted Language Learning

Hao Huang (1), Jianming Wang (1), Halidan Abudureyimu (2)

(1) Department of Information Science and Engineering; (2) Department of Electrical Engineering;
Xinjiang University, Urumqi, China

In this paper, we propose and evaluate a novel discriminative training criterion for hidden Markov model (HMM) based automatic mispronunciation detection in computer-assisted pronunciation training. The objective function is formulated as a smooth form of the F1-score on the annotated non-native speech database. The objective is maximized using extended Baum-Welch-like HMM update equations based on the weak-sense auxiliary function method. The acoustic model and the phone threshold parameters are updated simultaneously to ensure that the objective improves. Mispronunciation detection experiments show that the method is effective in increasing the F1-score, precision, recall, and detection accuracy on both the training data and the evaluation data.

Index Terms: automatic mispronunciation detection, F1-score, discriminative training, computer-assisted language learning
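
The abstract does not give the exact form of the smoothed objective, so the following is only a minimal sketch of one common way to make the F1-score differentiable. It assumes that a phone segment is flagged as mispronounced when its pronunciation score falls below a phone-dependent threshold, and it replaces that hard decision with a sigmoid. The function name `smoothed_f1`, the slope parameter `alpha`, and the sigmoid smoothing itself are illustrative assumptions, not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function used as a soft replacement for a hard 0/1 decision."""
    return 1.0 / (1.0 + math.exp(-x))


def smoothed_f1(scores, labels, phones, thresholds, alpha=1.0):
    """Differentiable approximation of the F1-score (illustrative assumption).

    scores     : per-segment pronunciation scores (higher = better pronounced)
    labels     : 1 if the segment is annotated as mispronounced, else 0
    phones     : phone identity of each segment
    thresholds : dict mapping phone -> decision threshold
    alpha      : slope of the sigmoid; larger values approach the hard decision
    """
    soft_true_pos = 0.0   # soft count of correctly detected mispronunciations
    soft_detected = 0.0   # soft count of all segments flagged as mispronounced
    n_mispronounced = sum(labels)

    for score, label, phone in zip(scores, labels, phones):
        # Soft detection: close to 1 when the score is well below the threshold.
        detection = sigmoid(alpha * (thresholds[phone] - score))
        soft_detected += detection
        soft_true_pos += label * detection

    # F1 = 2PR / (P + R) = 2 * TP / (detected + annotated mispronunciations)
    denominator = soft_detected + n_mispronounced
    return 2.0 * soft_true_pos / denominator if denominator > 0 else 0.0
```

Because both the scores (through the acoustic model) and the thresholds appear inside the sigmoid, an objective of this kind depends smoothly on both parameter sets, which is consistent with the simultaneous update described in the abstract. The update equations themselves are not reproduced here.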


Bibliographic reference. Huang, Hao / Wang, Jianming / Abudureyimu, Halidan (2012): "Maximum F1-score discriminative training for automatic mispronunciation detection in computer-assisted language learning", in INTERSPEECH-2012, 815-818.