Speech Prosody 2008

Campinas, Brazil
May 6-9, 2008

Prosody Variation: Application to Automatic Prosody Evaluation of Mandarin Speech

Huibin Jia (1), Jianhua Tao (1), Xia Wang (2)

(1) National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
(2) Nokia Research Centre, China

Prosody evaluation is an essential part of a computer-aided language learning system. In this paper, inter-speaker prosodic variability is investigated using a database containing eight repetitions of 200 sentences. For read Mandarin speech, this variability can be analyzed in terms of rhythm, intonation and tone. Experimental results show that the mean inter-speaker correlation is 0.70 for tone and 0.81 for intonation and rhythm. Based on these analyses, the prosodic similarity between a tested utterance and the standard utterances is calculated to evaluate prosody quality automatically. The standard utterances were recorded by multiple speakers, so they cover different prosody patterns for the same utterance. The similarities are calculated for three aspects (tone, intonation and rhythm), and the prosody quality is graded from them. Evaluated on the collected database, the method performs well: the correlation between human and machine scores is close to that between human scores.
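
The idea of comparing a tested utterance with several standard recordings can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the feature representation, or how multiple references are combined, so everything below is an assumption made for illustration: each aspect (tone, intonation, rhythm) is represented by an equal-length list of per-syllable values, similarity is the Pearson correlation, and the best-matching reference is kept for each aspect. The function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant contour carries no correlation information
    return sxy / math.sqrt(sxx * syy)

def best_similarity(test, references):
    """Assumption: keep the highest correlation over all reference speakers,
    so any acceptable prosody pattern among the references can match."""
    return max(pearson(test, ref) for ref in references)

def prosody_similarities(test_feats, ref_feats_list,
                         aspects=("tone", "intonation", "rhythm")):
    """test_feats: dict mapping aspect -> per-syllable values.
    ref_feats_list: one such dict per reference speaker."""
    return {
        aspect: best_similarity(test_feats[aspect],
                                [ref[aspect] for ref in ref_feats_list])
        for aspect in aspects
    }
```

Mapping the three similarities to a grade (for example by thresholds or by a regression fitted to human scores) is left out, since the abstract gives no detail on that step.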

Bibliographic reference.  Jia, Huibin / Tao, Jianhua / Wang, Xia (2008): "Prosody variation: application to automatic prosody evaluation of Mandarin speech", In SP-2008, 547-550.