13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

C2H: A Computational Model of H&H-based Phonetic Contrast in Synthetic Speech

Mauro Nicolao (1), Javier Latorre (2), Roger K. Moore (1)

(1) Speech and Hearing Group, Dept. Computer Science, University of Sheffield, UK
(2) Toshiba Research Europe Ltd., Cambridge Research Laboratory, UK

This paper presents a computational model of human speech production based on the hypothesis that low-energy attractors for a human speech production system can be identified, and that interpolation/extrapolation along the key dimension of hypo/hyper-articulation can be motivated by energetic considerations of phonetic contrast. The model was implemented using an HMM-based speech synthesiser together with continuous adaptation of its statistical models. Two adaptation methods were proposed for vowel and consonant models, and their effectiveness was tested by showing that such hypo/hyper-articulation control can successfully manipulate the intelligibility of synthetic speech in noise. Objective evaluations with the ANSI Speech Intelligibility Index indicate that intelligibility in various types of noise is effectively controlled. In particular, for the hyper-articulation transforms, the improvement with respect to unadapted speech is above 25%.
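
The interpolation/extrapolation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's adaptation method: it assumes, purely for illustration, that a phone model is summarised by a single mean vector, that the low-energy attractor is another point in the same feature space, and that one scalar `alpha` controls the degree of articulation. The function name `scale_contrast`, the parameter `alpha`, and all numeric values are hypothetical.

```python
import numpy as np


def scale_contrast(mean, attractor, alpha):
    """Move a model mean along the line joining it to the attractor.

    alpha = 0 leaves the mean unchanged.
    alpha > 0 extrapolates away from the attractor (hyper-articulation).
    -1 <= alpha < 0 interpolates toward the attractor (hypo-articulation),
    reaching the attractor itself at alpha = -1.
    """
    mean = np.asarray(mean, dtype=float)
    attractor = np.asarray(attractor, dtype=float)
    return mean + alpha * (mean - attractor)


# Made-up two-dimensional feature values, for illustration only.
vowel_mean = np.array([300.0, 2300.0])
attractor = np.array([500.0, 1500.0])

hyper = scale_contrast(vowel_mean, attractor, 0.5)   # further from the attractor
hypo = scale_contrast(vowel_mean, attractor, -0.5)   # closer to the attractor
```

Under these assumptions, a single control value moves every model consistently toward or away from its attractor, which is one simple way to read "interpolation/extrapolation along the key dimension of hypo/hyper-articulation".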

Index Terms: reactive speech synthesis, hypo/hyper-articulated speech, intelligibility enhancement


Bibliographic reference. Nicolao, Mauro / Latorre, Javier / Moore, Roger K. (2012): "C2H: a computational model of H&H-based phonetic contrast in synthetic speech", In INTERSPEECH-2012, 987-990.