Auditory-Visual Speech Processing (AVSP) 2011
This study examined the perception of linguistic prosody from augmented point-light displays derived from motion tracking of six talkers producing different prosodic contrasts. In Experiment 1, we determined perceivers' ability to use these abstract visual displays to match prosody across modalities (audio to video), when the non-matching visual display was segmentally identical and differed only in prosody. The results showed that perceivers were able to match the auditory speech to these limited face-motion prosodic displays at better than chance levels; however, performance varied greatly across the stimuli of different talkers. A subjective perceptual rating task (Experiment 2) demonstrated that variation across talkers in the acoustic realization of prosodic contrasts may account for some of this difference; nevertheless, a combination of the salience of acoustic and visual prosodic cues is likely to be driving matching performance.
Index Terms. visual prosody, focus, phrasing, point-light displays, cross-modal matching
Bibliographic reference. Cvejic, Erin / Kim, Jeesun / Davis, Chris (2011): "Perceiving visual prosody from point-light displays", In AVSP-2011, 15-20.