Second Language Studies: Acquisition, Learning, Education and Technology

Tokyo, Japan
September 22-24, 2010

Pronunciation Proficiency Estimation Based on Multilayer Regression Analysis Using Speaker-Independent Structural Features

Masayuki Suzuki (1), Yu Qiao (2), Nobuaki Minematsu (1), Keikichi Hirose (1)

(1) The University of Tokyo, Japan
(2) Shenzhen Institutes of Advanced Technology, China

Teachers can assess students’ pronunciation independently of extra-linguistic features, such as age and gender, that are observed in the students’ utterances. This capacity is, however, difficult to realize on machines because linguistic and extra-linguistic differences both alter the same acoustic features. The performance of automatic pronunciation assessment is therefore inevitably affected by extra-linguistic features. Recently, we proposed acoustic features that are independent of extra-linguistic factors, called structural features, and realized a technique for pronunciation proficiency estimation that is extremely robust to these factors. In this paper, we extend this technique with multilayer regression analysis, in which supervised learning is done at each layer using teachers’ scores for that layer. Proficiency estimation experiments show that the extended technique yields higher correlations between teachers’ scores and machine scores than our previous structure-based assessment.

Bibliographic reference. Suzuki, Masayuki / Qiao, Yu / Minematsu, Nobuaki / Hirose, Keikichi (2010): "Pronunciation proficiency estimation based on multilayer regression analysis using speaker-independent structural features", In L2WS-2010, paper O2-3.