13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Improving L1-Specific Phonological Error Diagnosis in Computer Assisted Pronunciation Training

Theban Stanley, Kadri Hacioglu

Rosetta Stone Labs, Boulder, CO, USA

With the increasing use of technology in classrooms, computer-assisted pronunciation training (CAPT) is becoming a vital tool in language learning. In this paper, we present a system that uses data from learners of a specific L1 to better model phonological errors at several levels of the system. At the lexical level, a statistical machine translation approach models the common phonological errors produced by a specific L1 population. At the acoustic level, we explore L1-dependent maximum likelihood (ML) nonnative models and discriminative training. In our experiments, a Korean-language-dependent nonnative lexicon provides diagnostic abilities that did not exist in our baseline configuration. Replacing the native ML acoustic model with the L1-dependent nonnative model produces a relative improvement of 27.37% in precision for phone detection/identification tasks. We also propose a constrained variant of minimum phone error (MPE) training that is better adapted to phone detection/diagnosis. This technique produces a 5.6% relative improvement in precision over ML nonnative acoustic models.
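
The precision figures in the abstract are relative improvements, that is, ratios against a baseline rather than absolute percentage-point gains. The following is a minimal sketch of that arithmetic, assuming the usual definition of precision as true positives over all flagged items. The function names and the counts are hypothetical and chosen only for illustration; they are not results or code from the paper.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of flagged phone errors that are real errors (assumed definition)."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("precision is undefined when nothing is flagged")
    return true_positives / flagged


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative change of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical counts, NOT taken from the paper.
baseline_precision = precision(true_positives=40, false_positives=60)
improved_precision = precision(true_positives=50, false_positives=50)

gain = relative_improvement(baseline_precision, improved_precision)
print(f"baseline={baseline_precision:.2f} "
      f"improved={improved_precision:.2f} relative gain={gain:.1f}%")
```

With these made-up counts, a ten-point absolute gain in precision corresponds to a relative gain of roughly a quarter of the baseline, which is why relative and absolute figures should not be compared directly.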

Index Terms: language learning, phonological error modeling, machine translation, minimum phone error training

Bibliographic reference.  Stanley, Theban / Hacioglu, Kadri (2012): "Improving L1-specific phonological error diagnosis in computer assisted pronunciation training", In INTERSPEECH-2012, 827-830.