Sixth ISCA Workshop on Speech Synthesis

Bonn, Germany
August 22-24, 2007

Analysis of Affective Speech Recordings using the Superpositional Intonation Model

Esther Klabbers, Taniya Mishra, Jan P. H. van Santen

Center for Spoken Language Understanding, OGI School of Science & Engineering at OHSU, Beaverton, OR, USA

This paper presents an analysis of affective sentences spoken by a single speaker. The corpus was analyzed in terms of acoustic and prosodic features, including features derived from decomposing pitch contours into phrase and accent curves. Sentences spoken with a sad affect were the most easily distinguished from the other affects: they were characterized by a lower F0, lower phrase and accent curves, lower overall energy, and a higher spectral tilt. Fearful speech was also relatively easy to distinguish from angry and happy speech, as it exhibited flatter phrase curves and lower accent curves. Angry and happy speech were more difficult to distinguish from each other, but angry speech exhibited a higher spectral tilt and a lower speaking rate. These results provide informative clues for synthesizing affective speech using our proposed recombinant synthesis method.

