Speech Prosody 2002

Aix-en-Provence, France
April 11-13, 2002

Multimodal Feedback Cues in Human-Machine Interactions

Björn Granström (1), David House (1), Marc Swerts (2)

[Names in alphabetical order]
(1) CTT, KTH, Stockholm, Sweden
(2) CNTS, Antwerp University, Belgium and TU/e, Eindhoven, The Netherlands

This paper reports on an experiment that explored the relevance of both acoustic and visual cues for signaling ‘negative’ or ‘affirmative’ feedback in a conversation. The stimuli were created with the WaveSurfer software developed at CTT by orthogonally varying six parameters (four visual and two acoustic), each with two settings: one hypothesised to lead to affirmative feedback responses and one hypothesised to lead to negative responses. Listeners were told that they would see and hear a series of exchanges between a talking head, representing a travel agent, and a human who wants to make a booking with the agent. They were asked to imagine standing beside the human and witnessing a fragment of a longer dialogue exchange. Their task was to rate each fragment according to whether the agent signals that he understands and accepts the human utterance or that he is uncertain about it. Results show that listeners are sensitive to both the visual and the acoustic features when judging the utterances in terms of their function as feedback signals. Four of the six parameters had a significant influence on the judgements, with Smile and F0 the most prominent, followed by Eyebrow and Head_movement. Eye_closure and Delay contributed only marginally to the judgements, but the tendency was in the expected direction.
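
To illustrate what orthogonal variation of six two-valued parameters implies, the following is a minimal sketch, assuming a full factorial design in which every combination of settings is used. That assumption would give 2^6 = 64 stimuli; the abstract does not state the actual number. The parameter names come from the abstract, while the setting labels and the function name build_stimuli are hypothetical and chosen only for illustration.

```python
from itertools import product

# Parameter names are taken from the abstract. The two setting labels are
# placeholders: the abstract does not give the actual parameter values.
PARAMETERS = ["Smile", "F0", "Eyebrow", "Head_movement", "Eye_closure", "Delay"]
SETTINGS = ("affirmative", "negative")


def build_stimuli(parameters=PARAMETERS, settings=SETTINGS):
    """Return every combination of settings, one dict per stimulus.

    Assumes a full factorial design (hypothetical; see the lead-in).
    """
    return [
        dict(zip(parameters, combo))
        for combo in product(settings, repeat=len(parameters))
    ]


stimuli = build_stimuli()
print(len(stimuli))  # 2**6 = 64 combinations
```

In such a design each setting of each parameter occurs in exactly half of the stimuli, so the contribution of one cue to the judgements can be estimated independently of the others.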


Bibliographic reference.  Granström, Björn / House, David / Swerts, Marc (2002): "Multimodal feedback cues in human-machine interactions", In SP-2002, 347-350.