Improved Acoustic Modelling for Automatic Literacy Assessment of Children

Mauro Nicolao, Michiel Sanders, Thomas Hain

Automatic literacy assessment of children is a complex task that normally requires carefully annotated data. This paper focuses on a system for the assessment of reading skills, aiming to detect a range of fluency and pronunciation errors. Naturally, reading is a prompted task and thereby the acquisition of training data for acoustic modelling should be straightforward. However, given the prominence of errors in the training set and the importance of labelling them in the transcription, a lightly supervised approach to acoustic modelling has better chances of success. A method based on weighted finite state transducers is proposed to model specific prompt corrections, such as repetitions, substitutions and deletions, as observed in real recordings. Iterative cycles of lightly supervised training are performed in which decoding improves the transcriptions and the derived models. Improvements are due to increasing accuracy in phone-to-sound alignment and in the training data selection. The effectiveness of the proposed methods for relabelling and acoustic modelling is assessed through experiments on the CHOREC corpus, in terms of sequence error rate and alignment accuracy. Improvements over the baseline of up to 60% and 23.3% respectively are observed.
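The idea of modelling prompt corrections with a weighted finite-state machine can be illustrated with a small word-level sketch. This is not the authors' implementation: it assumes a simple acceptor built from the prompt, with arcs for correct reading, deletion, substitution and repetition of the previous word, and it finds the cheapest path for an already decoded word sequence. The function name `label_reading` and the arc weights are hypothetical and chosen only for illustration; the system described in the abstract works with acoustic models rather than with text.

```python
from typing import List, Tuple

# Illustrative arc weights (assumptions, not values from the paper).
REP_COST = 0.5   # child repeats the word just read
SUB_COST = 1.0   # child says a different word
DEL_COST = 1.0   # child skips a prompt word


def label_reading(prompt: List[str], spoken: List[str]) -> Tuple[float, List[str]]:
    """Return the cheapest (cost, labels) path through a prompt acceptor.

    State i means "the first i prompt words have been accounted for".
    """
    n, m = len(prompt), len(spoken)
    inf = float("inf")
    # best[j][i]: cheapest way to consume j spoken words and reach state i.
    best = [[(inf, []) for _ in range(n + 1)] for _ in range(m + 1)]
    best[0][0] = (0.0, [])

    def relax(j: int, i: int, cost: float, labels: List[str]) -> None:
        # Keep the new path only if it is strictly cheaper.
        if cost < best[j][i][0]:
            best[j][i] = (cost, labels)

    for j in range(m + 1):
        for i in range(n + 1):
            cost, labels = best[j][i]
            if cost == inf:
                continue
            if i < n:
                # Deletion: epsilon arc that skips prompt[i].
                relax(j, i + 1, cost + DEL_COST, labels + [f"DEL({prompt[i]})"])
            if j < m:
                word = spoken[j]
                if i < n:
                    if word == prompt[i]:
                        # Correct reading of the next prompt word.
                        relax(j + 1, i + 1, cost, labels + [f"OK({word})"])
                    else:
                        # Substitution of the next prompt word.
                        relax(j + 1, i + 1, cost + SUB_COST,
                              labels + [f"SUB({prompt[i]}->{word})"])
                if i > 0 and word == prompt[i - 1]:
                    # Repetition: self-loop on the previous prompt word.
                    relax(j + 1, i, cost + REP_COST, labels + [f"REP({word})"])
    return best[m][n]


if __name__ == "__main__":
    cost, labels = label_reading(["the", "cat", "sat"], ["the", "the", "cat", "sat"])
    print(cost, labels)
```

In a lightly supervised loop of the kind the abstract describes, labels like these would be produced by decoding audio against the prompt-derived grammar and then used to select and relabel training data; here the input is plain text only to keep the sketch runnable.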

DOI: 10.21437/Interspeech.2018-2118

Cite as: Nicolao, M., Sanders, M., Hain, T. (2018) Improved Acoustic Modelling for Automatic Literacy Assessment of Children. Proc. Interspeech 2018, 1666-1670, DOI: 10.21437/Interspeech.2018-2118.

  author={Mauro Nicolao and Michiel Sanders and Thomas Hain},
  title={Improved Acoustic Modelling for Automatic Literacy Assessment of Children},
  booktitle={Proc. Interspeech 2018},