Speech Prosody 2002

Aix-en-Provence, France
April 11-13, 2002

Duration Models and the Perceptual Evaluation of Spoken Korean

Hyunsong Chung

Department of Computer Science, University College Dublin, Ireland

This paper builds predictive models of segment duration in context for Korean text-to-speech, using CART models and "additive-multiplicative" models. The models are built from a corpus of 670 read sentences collected from one speaker of standard Korean. The best performance was obtained with a CART decision tree model, for which the correlation between the observed and the predicted durations is 0.77 and the mean squared error of prediction is 25.11 ms. Linguistic implications of these models are also discussed. The models are evaluated perceptually with a Korean diphone database based on the MBROLA synthesis system, in order to investigate the clarity and the listener preference for the durations.
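
The two figures quoted above, the correlation and the prediction error, can be illustrated with a minimal sketch. It is not the paper's evaluation code. The function names and the toy durations are hypothetical. The error is computed as the root of the mean squared error so that the result is in milliseconds, which is an assumption about how the reported figure was obtained.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted durations."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root of the mean squared prediction error, in the units of the input."""
    n = len(observed)
    if n != len(predicted) or n == 0:
        raise ValueError("need two equal-length, non-empty sequences")
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Toy segment durations in milliseconds (made up, NOT data from the paper).
observed_ms = [62.0, 85.0, 110.0, 47.0, 130.0]
predicted_ms = [70.0, 80.0, 100.0, 55.0, 120.0]

r = pearson_r(observed_ms, predicted_ms)
err = rmse(observed_ms, predicted_ms)
print(f"correlation = {r:.2f}, error = {err:.2f} ms")
```

With real data, `observed_ms` would hold the measured segment durations and `predicted_ms` the durations predicted by a duration model for the same segments.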

Bibliographic reference. Chung, Hyunsong (2002): "Duration models and the perceptual evaluation of spoken Korean", in SP-2002, 219-222.