Speech Prosody 2008

Campinas, Brazil
May 6-9, 2008

Using Prosody for Automatically Monitoring Human-Computer Call Dialogues

Woosung Kim

Convergys Corporation, Cincinnati, OH, USA

In human-computer call dialogues, human callers often get frustrated or angry due to, e.g., the computer's mistakes. Detecting such emotions would be beneficial for many purposes; nevertheless, emotion detection so far has been studied primarily as a classification task. Taking a step forward from classifying emotions from a single utterance, this paper investigates whether emotions, detected by prosodic features, can be used practically at the dialogue level, i.e., for monitoring human-computer dialogues to detect BAD calls requiring a human agent's assistance. We first show that emotion detection can be improved by a regression model. In combining emotion detection and dialogue monitoring, we demonstrate that decision-level fusion is better than feature-level fusion. Our experiments also confirm that NEGATIVE emotions may be a sufficient, but not a necessary, condition for detecting BAD calls. Finally, we show that BAD calls due to the caller's NEGATIVE emotions may be identified by other clues.
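The contrast between the two fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the linear scorer, the weights, the mixing parameter `alpha`, and the 0.5 threshold are all hypothetical choices made only to show where the combination happens in each scheme.

```python
from typing import Sequence


def linear_score(features: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Hypothetical stand-in for any trained model: a plain weighted sum."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return bias + sum(f * w for f, w in zip(features, weights))


def feature_level_fusion(
    prosodic: Sequence[float],
    dialogue: Sequence[float],
    weights: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Feature-level fusion: concatenate both feature sets, then score once."""
    fused = list(prosodic) + list(dialogue)
    return linear_score(fused, weights, bias)


def decision_level_fusion(emotion_score: float, dialogue_score: float, alpha: float = 0.5) -> float:
    """Decision-level fusion: each model scores separately; only the scores are combined.

    alpha is an assumed mixing weight, not a value taken from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * emotion_score + (1.0 - alpha) * dialogue_score


def is_bad_call(score: float, threshold: float = 0.5) -> bool:
    """Flag a call for a human agent when the fused score reaches the (assumed) threshold."""
    return score >= threshold


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    early = feature_level_fusion([1.0, 2.0], [3.0], weights=[0.5, 0.25, 1.0])
    late = decision_level_fusion(emotion_score=0.8, dialogue_score=0.2, alpha=0.5)
    print(early, late, is_bad_call(late))
```

In the feature-level variant a single model sees every feature at once, whereas in the decision-level variant the emotion detector and the dialogue monitor stay independent and only their outputs meet.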


Bibliographic reference.  Kim, Woosung (2008): "Using prosody for automatically monitoring human-computer call dialogues", In SP-2008, 79-82.