INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Temporal Relationship Between Auditory and Visual Prosodic Cues

Erin Cvejic, Jeesun Kim, Chris Davis

University of Western Sydney, Australia

It has been reported that non-articulatory visual cues to prosody tend to align with auditory cues, emphasizing auditory events that are in close alignment (visual alignment hypothesis). We investigated the temporal relationship between visual and auditory prosodic cues in a large corpus of utterances to determine the extent to which non-articulatory visual prosodic cues align with auditory ones. Six speakers saying 30 sentences in three prosodic conditions (×2 repetitions) were recorded in a dialogue exchange task, to measure how often eyebrow movements and rigid head tilts aligned with auditory prosodic cues, the temporal distribution of such movements, and the variation across prosodic conditions. The timing of brow raises and head tilts was not aligned with auditory cues, and the occurrence of visual cues was inconsistent, lending little support to the visual alignment hypothesis. Different types of visual cues may combine with auditory cues in different ways to signal prosody.


Bibliographic reference. Cvejic, Erin / Kim, Jeesun / Davis, Chris (2011): "Temporal relationship between auditory and visual prosodic cues", in INTERSPEECH-2011, 981-984.