Speech Prosody 2012

Shanghai, China
May 22-25, 2012

Exploiting Time and Frequency Domain Measures for Precise Voice Source Parameterisation

John Kane, Irena Yanushevskaya, Ailbhe Ní Chasaide, Christer Gobl

Phonetics and Speech Laboratory, Centre for Language and Communication Studies, Trinity College Dublin, Ireland

Much of our research has focused on the role of the voice source in the prosody of spoken language, including its linguistic and expressive dimensions. However, because automatic methods for both deriving and modelling the voice source tend to lack robustness, we have generally conducted studies on small amounts of speech data. These studies have relied on labour-intensive methods that require pulse-by-pulse manual fine-tuning. This paper describes a method that models the voice source automatically by incorporating some of the strategies used in manual fine-tuning. The method combines exhaustive search, dynamic programming and optimisation methods to overcome the known difficulties of standard automatic algorithms. In a quantitative evaluation, the proposed method yielded parameter values closer to the reference values than those obtained with a standard time-based method.
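The abstract does not give the algorithm's details, so the following is only a minimal sketch of how an exhaustive search and dynamic programming might be combined in general. It assumes each glottal pulse already has a small set of candidate parameter values, each with a fitting error from an exhaustive search, and it selects one candidate per pulse by minimising fitting error plus a penalty on pulse-to-pulse jumps. The function name `select_track`, the absolute-difference transition cost and the weight `smooth` are all illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence


def select_track(
    candidates: Sequence[Sequence[float]],
    errors: Sequence[Sequence[float]],
    smooth: float = 1.0,
) -> List[float]:
    """Pick one candidate value per pulse with a Viterbi-style search.

    candidates[t][k]: k-th candidate parameter value for pulse t
        (hypothetical, e.g. from an exhaustive search over a grid).
    errors[t][k]: fitting error of that candidate (lower is better).
    smooth: assumed weight on the pulse-to-pulse jump |v_t - v_(t-1)|.
    """
    if not candidates:
        return []
    # cost[k]: best accumulated cost of any path ending in candidate k.
    cost = list(errors[0])
    backptrs: List[List[int]] = []
    for t in range(1, len(candidates)):
        prev = candidates[t - 1]
        new_cost, ptr = [], []
        for k, value in enumerate(candidates[t]):
            # Best predecessor: accumulated cost plus transition penalty.
            best_j = min(
                range(len(prev)),
                key=lambda j: cost[j] + smooth * abs(value - prev[j]),
            )
            jump = abs(value - prev[best_j])
            new_cost.append(cost[best_j] + smooth * jump + errors[t][k])
            ptr.append(best_j)
        cost = new_cost
        backptrs.append(ptr)
    # Backtrack from the cheapest final candidate.
    k = min(range(len(cost)), key=cost.__getitem__)
    path = [k]
    for ptr in reversed(backptrs):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return [candidates[t][k] for t, k in enumerate(path)]
```

With `smooth=0.0` the search reduces to picking the lowest-error candidate for each pulse independently; larger values favour smoother parameter contours, which is one plausible way to mimic the continuity a human annotator enforces when fine-tuning pulse by pulse.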

Index Terms: voice source, LF model, parameterisation, prosody, dynamic programming


Bibliographic reference. Kane, John / Yanushevskaya, Irena / Ní Chasaide, Ailbhe / Gobl, Christer (2012): "Exploiting time and frequency domain measures for precise voice source parameterisation", in SP-2012, 143-146.