13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Improving Recognition of Speaker States and Traits by Cumulative Evidence: Intoxication, Sleepiness, Age and Gender

Felix Weninger, Erik Marchi, Björn Schuller

Institute for Human-Machine Communication, Technische Universität München, Germany

We address the iterative refinement of classifier decisions for the recognition of intoxication, sleepiness, age and gender from speech. Because these tasks concern 'medium-term' or 'long-term' speaker states and traits, as opposed to short-term states such as emotion, cumulative evidence can be collected in the form of utterance-level decisions; we show that fusing these decisions along the time axis yields increasingly reliable decisions. In extensive test runs on three official INTERSPEECH Challenge corpora, we show that the average recall can be improved by up to 5%, 6%, 10% and 11% absolute by longer-term observation of speaker sleepiness, gender, intoxication, and age, respectively, compared to a decision based on a single utterance.
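
The idea of fusing utterance-level decisions along the time axis can be illustrated with a small sketch. The abstract does not state which fusion rule is used, so the running majority vote below is an assumption chosen for illustration only, and the function name `cumulative_majority_vote` and the example labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def cumulative_majority_vote(decisions: Sequence[Hashable]) -> List[Hashable]:
    """Return the fused decision after each utterance of one speaker.

    Assumption: fusion is a running majority vote over all utterance-level
    decisions seen so far. Ties keep the previously fused label.
    """
    counts: Counter = Counter()
    fused: List[Hashable] = []
    current = None
    for label in decisions:
        counts[label] += 1
        # Only the label that was just counted can overtake the current one.
        if current is None or counts[label] > counts[current]:
            current = label
        fused.append(current)
    return fused


# Hypothetical per-utterance classifier outputs for a single speaker.
utterance_decisions = ["sober", "intoxicated", "intoxicated", "sober", "intoxicated"]
fused_decisions = cumulative_majority_vote(utterance_decisions)
print(fused_decisions)
```

With more utterances, an isolated misclassification (such as the fourth decision in the example) no longer changes the fused result, which is the intuition behind collecting cumulative evidence for states and traits that stay constant over the observation period.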

Bibliographic reference. Weninger, Felix / Marchi, Erik / Schuller, Björn (2012): "Improving recognition of speaker states and traits by cumulative evidence: intoxication, sleepiness, age and gender", in INTERSPEECH-2012, 1159-1162.