Third ESCA/COCOSDA Workshop on Speech Synthesis

November 26-29, 1998
Jenolan Caves House, Blue Mountains, NSW, Australia

Comparison of subjective evaluation and an objective evaluation metric for prosody in text-to-speech synthesis

Daniel Hirst (1), Albert Rilliard (2), Véronique Aubergé (2)

(1) CNRS LPL, Université de Provence, Aix-en-Provence
(2) CNRS ICP, Université Stendhal, Grenoble III

An experimental technique is described for eliciting a subjective evaluation of the prosody of synthetic speech by untrained listeners. The technique makes use of a graphic display time-aligned with the speech signal. Subjects are asked to indicate which parts of a recording are unsatisfactory by clicking on a computer screen with a mouse. The technique was applied to two TTS systems for French. Results obtained using this technique are to be compared with those obtained using an objective evaluation metric for prosodic characteristics, comparing the synthetic versions with a number of different readings by human speakers.
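As a rough illustration of how click responses on a time-aligned display might be tallied, the sketch below maps each click time to the segment of the recording it falls in and counts clicks per segment. It is not taken from the paper: the function name `clicks_to_segment_counts`, the representation of segments by their start times, and the use of seconds as the unit are all assumptions made for the example.

```python
from bisect import bisect_right
from collections import Counter


def clicks_to_segment_counts(segment_starts, total_duration, click_times):
    """Count how many clicks fall inside each segment of a recording.

    segment_starts: sorted start times (seconds) of the segments,
                    e.g. word or syllable boundaries (hypothetical input).
    total_duration: end time (seconds) of the recording.
    click_times:    times (seconds) at which a listener clicked.
    """
    counts = Counter()
    for t in click_times:
        # Ignore clicks that fall outside the recording.
        if not (0 <= t < total_duration):
            continue
        # Index of the last segment starting at or before the click time.
        idx = bisect_right(segment_starts, t) - 1
        if idx >= 0:
            counts[idx] += 1
    return [counts[i] for i in range(len(segment_starts))]


if __name__ == "__main__":
    # Three hypothetical segments starting at 0.0 s, 0.5 s and 1.2 s.
    starts = [0.0, 0.5, 1.2]
    clicks = [0.1, 0.6, 0.7, 1.9, 2.5]
    print(clicks_to_segment_counts(starts, 2.0, clicks))
```

Summing such per-segment counts over listeners would give one simple profile of where a synthetic utterance was judged unsatisfactory; how the authors actually aggregate responses is described in the full paper, not here.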


Bibliographic reference. Hirst, Daniel / Rilliard, Albert / Aubergé, Véronique (1998): "Comparison of subjective evaluation and an objective evaluation metric for prosody in text-to-speech synthesis", In SSW3-1998, 1-4.