This study presents the results of applying automated speech scoring technology to English spoken responses provided by non-native children in the context of an English proficiency assessment for middle school students. The assessment contains three diverse task types designed to measure a student's English communication skills, and an automated scoring system was used to extract features and build scoring models for each task. The results show that the automated scores have a correlation of r = 0.70 with human scores for the Read Aloud task, which matches the human-human agreement level. For the two tasks involving spontaneous speech, the automated scores obtain correlations of r = 0.62 and r = 0.63 with human scores, which represents a drop of 0.08-0.09 from the human-human agreement level. When all 5 scores from the assessment for a given student are aggregated, the automated speaker-level scores show a correlation of r = 0.78 with human scores, compared to a human-human correlation of r = 0.90. The challenges of using automated spoken language assessment for children are discussed, and directions for future improvements are proposed.
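The agreement figures above are Pearson correlations computed at two levels: per response and per speaker. The following is a minimal sketch of that computation, not the authors' implementation. It assumes that speaker-level scores are formed by summing a speaker's item scores (the abstract does not state the aggregation rule), and the function names `pearson_r` and `speaker_totals` as well as the toy data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def speaker_totals(item_scores):
    """Sum each speaker's item scores (assumed aggregation rule).

    item_scores maps a speaker id to that speaker's list of item scores.
    Totals are returned in sorted speaker-id order so that human and
    automated totals line up when both dicts share the same keys.
    """
    return [sum(item_scores[spk]) for spk in sorted(item_scores)]


# Invented toy data for illustration only; not taken from the study.
human = {"s1": [3, 2, 4], "s2": [1, 2, 2], "s3": [4, 4, 3]}
machine = {"s1": [3, 3, 3], "s2": [2, 2, 1], "s3": [4, 3, 4]}

# Speaker-level agreement between aggregated human and automated scores.
speaker_level_r = pearson_r(speaker_totals(human), speaker_totals(machine))
```

Aggregating over several responses tends to average out item-level noise, which is one plausible reason speaker-level correlations exceed item-level ones for both human and automated scores.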
Bibliographic reference. Evanini, Keelan / Wang, Xinhao (2013): "Automated speech scoring for non-native middle school students with multiple task types", In INTERSPEECH-2013, 2435-2439.