Speech Prosody 2008
This paper investigates some non-F0 cues to emotional speech. Two samples of the word "leave" were taken from spontaneous speech: one spoken with emotion (sad) and the other non-emotional. Using the morphing algorithm of STRAIGHT, we created a continuum of 12 morphed utterances, from the non-emotional "leave" to the emotional "leave", with F0 held at 300 Hz. Perception test results show that the morphed speech sounds could be identified as sad, with stimulus 12 heard as the most emotional. A simple correlation analysis, together with a principal component analysis (PCA) of listeners’ perceptual behavior, suggests that formant frequencies, specifically the lowering of F2, F3, and F4, are important cues for the perception of emotional (sad) speech.
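As a rough illustration of what a 12-step morphing continuum involves, the sketch below linearly interpolates between two parameter vectors. It is not the STRAIGHT algorithm, which morphs full time-frequency representations. The even spacing of the steps, the function name `morph_continuum`, and the formant values are all assumptions made for illustration; they are placeholders, not measurements from the study.

```python
def morph_continuum(start, end, steps=12):
    """Return `steps` parameter vectors running linearly from `start` to `end`.

    Assumption: the morphing rate is evenly spaced between 0 and 1.
    """
    if steps < 2:
        raise ValueError("a continuum needs at least two steps")
    if len(start) != len(end):
        raise ValueError("start and end must have the same length")
    continuum = []
    for i in range(steps):
        rate = i / (steps - 1)  # 0.0 at stimulus 1, 1.0 at the last stimulus
        continuum.append([s + rate * (e - s) for s, e in zip(start, end)])
    return continuum


# Hypothetical F2, F3, F4 values in Hz (placeholders, not data from the paper).
neutral_formants = [2300.0, 3000.0, 4000.0]
sad_formants = [2100.0, 2800.0, 3800.0]

stimuli = morph_continuum(neutral_formants, sad_formants, steps=12)
for number, formants in enumerate(stimuli, start=1):
    print(number, [round(f, 1) for f in formants])
```

With these placeholder endpoints, stimulus 1 equals the non-emotional vector and stimulus 12 equals the emotional one, with each formant falling steadily in between.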
Bibliographic reference. Erickson, Donna / Shochi, Takaaki / Menezes, Caroline / Kawahara, Hideki / Sakakibara, Ken-Ichi (2008): "Some non-F0 cues to emotional speech: an experiment with morphing", In SP-2008, 677-680.