Speech Prosody 2008

Campinas, Brazil
May 6-9, 2008

Voice Stress Extraction

Grazyna Demenko

Institute of Linguistics, Adam Mickiewicz University, Poznań, Poland

The aim of the research was to assess the feasibility of voice stress extraction and classification. It was assumed that the results could be applied in call centers and could be useful for security services. The authentic Poznań police database containing recordings of calls to the 997 emergency phone line was used for analysis. Out of 60 000 recordings collected in the database, 20 000 were automatically selected, of which a few hundred were eventually chosen for acoustic evaluation on the basis of a perceptual assessment. The MDVP analysis confirmed the statistical significance of parameters such as fundamental frequency, energy and pitch variations for stress categorization. Some segmental parameters, such as tremor and noise parameters, were also confirmed to be of some importance. Under highly stressful conditions, a systematic shift in pitch of over one octave was observed. It was concluded that the range of F0 per se does not seem to correlate with stress, whereas the shift in F0 register constitutes the primary indicator of stress. Linear Discriminant Analysis based on 12 acoustic features showed that it is possible to distinguish the following classes: neutral, depressive, stressed and highly stressed speech.
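
The distinction drawn above between F0 range and F0 register shift can be illustrated with a minimal sketch. It assumes that register is summarized by the median F0 of voiced frames and that range is the span between the lowest and highest F0 value; both choices, the function names, and the sample values are hypothetical and are not taken from the paper.

```python
import math
from statistics import median


def f0_register_shift_octaves(reference_f0_hz, test_f0_hz):
    """Shift of the median F0 of the test contour relative to the reference, in octaves."""
    return math.log2(median(test_f0_hz) / median(reference_f0_hz))


def f0_range_semitones(f0_hz):
    """Span between the lowest and highest F0 value of a contour, in semitones."""
    return 12.0 * math.log2(max(f0_hz) / min(f0_hz))


# Hypothetical voiced-frame F0 values in Hz (illustrative only, not the paper's data).
neutral = [100.0, 110.0, 120.0, 130.0, 140.0]
stressed = [250.0, 275.0, 300.0, 325.0, 350.0]

shift = f0_register_shift_octaves(neutral, stressed)
print(f"register shift: {shift:.2f} octaves")
print(f"neutral range:  {f0_range_semitones(neutral):.2f} semitones")
print(f"stressed range: {f0_range_semitones(stressed):.2f} semitones")
```

In this constructed example the two contours span the same interval in semitones, while the median of the second lies more than one octave above the first, which is the kind of pattern the abstract describes for highly stressed speech.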


Bibliographic reference.  Demenko, Grazyna (2008): "Voice stress extraction", In SP-2008, 53-56.