Speech Prosody 2006

Dresden, Germany
May 2-5, 2006

Effects of Prosodic Factors on Spectral Balance: Analysis and Synthesis

Qi Miao, Xiaochuan Niu, Esther Klabbers, Jan van Santen

Center for Spoken Language Understanding, OGI School of Science & Engineering, Oregon Health & Science University, Beaverton, OR, USA

In natural speech, prosodic factors such as accent, stress, phrasal position, and speaking style play important roles in controlling several acoustic features, including segmental duration, pitch, and spectral balance, i.e., the amplitude pattern across different frequency ranges of the power spectrum. To synthesize speech that sounds natural, these effects need to be modeled accurately. In this study we describe and evaluate a synthesis method that mimics the effects of prosodic factors on spectral balance. We measure spectral balance as the energy in four broad frequency bands that correspond to formant frequency ranges, and use an additive model to capture the effects of prosodic factors on it. A new sinusoidal synthesis module, implemented under Festival, predicts the target spectral balance value for each band from the analysis results and applies it to the amplitude parameters of the sinusoidal model during synthesis. We evaluate an important strength of this system: its ability to reduce spectral discontinuities in unit concatenation.
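
The two measurement ideas in the abstract, band energies and an additive model, can be illustrated with a minimal sketch. It is not the authors' implementation. The band edges, the function names, and the effect sizes in the test values are all assumptions made for illustration; the abstract specifies only that four broad bands corresponding to formant frequency ranges are used and that factor effects combine additively.

```python
import cmath
import math

# Hypothetical band edges in Hz (an assumption; the abstract gives no values).
BAND_EDGES_HZ = [(0, 800), (800, 2500), (2500, 3500), (3500, 8000)]


def power_spectrum(frame):
    """Return the one-sided power spectrum of a real frame via a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def band_energies_db(frame, sample_rate, band_edges=BAND_EDGES_HZ):
    """Sum the power falling in each band and express it in dB."""
    n = len(frame)
    power = power_spectrum(frame)
    energies = []
    for low, high in band_edges:
        total = sum(p for k, p in enumerate(power)
                    if low <= k * sample_rate / n < high)
        # Small floor avoids log10(0) for empty or silent bands.
        energies.append(10.0 * math.log10(total + 1e-12))
    return energies


def additive_prediction(baseline_db, effects_db, factors):
    """Additive model: a baseline plus one dB offset per factor level."""
    return baseline_db + sum(effects_db[name][level]
                             for name, level in factors.items())


# Usage: a 1 kHz tone sampled at 16 kHz falls in the second band.
sample_rate = 16000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
energies = band_energies_db(frame, sample_rate)
strongest_band = energies.index(max(energies))
```

In a sketch like this, the per-band target from `additive_prediction` would be compared with the measured band energy, and the difference applied as a gain to the sinusoidal amplitudes in that band; how the actual system performs that step is not described in the abstract.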

Bibliographic reference. Miao, Qi / Niu, Xiaochuan / Klabbers, Esther / Santen, Jan van (2006): "Effects of prosodic factors on spectral balance: analysis and synthesis", in SP-2006, paper 107.