Speech Prosody 2006

Dresden, Germany
May 2-5, 2006

Optical Cues to the Visual Perception of Lexical and Phrasal Stress in English

Rebecca Scarborough (1), Patricia Keating (2), Marco Baroni (3), Taehong Cho (4), Sven Mattys (5), Abeer Alwan (2), Edward Auer Jr (6), Lynne E. Bernstein (7)

(1) Stanford University, USA; (2) University of California, Los Angeles, USA; (3) University of Bologna, Italy; (4) Hanyang University, Korea; (5) University of Bristol, England; (6) University of Kansas, USA; (7) House Ear Institute, USA

In a study of optical cues to the visual perception of stress, three American English talkers spoke words that differed in lexical stress and sentences that differed in phrasal stress, while video and movements of the face were recorded. In a production analysis, stressed vs. unstressed syllables from these utterances were compared along many measures of facial movement, which were generally larger and faster under stress. In a visual perception experiment, 16 perceivers identified the location of stress in forced-choice judgments of video clips of these utterances (without audio). Relative to chance, phrasal stress (54% correct vs. 25% chance) was perceived better than lexical stress (62% correct vs. 50% chance). The relation of the visual intelligibility of the prosody of these utterances to the optical characteristics of their production is discussed, with analysis of which cues are associated with successful visual perception.


Bibliographic reference. Scarborough, Rebecca / Keating, Patricia / Baroni, Marco / Cho, Taehong / Mattys, Sven / Alwan, Abeer / Auer Jr, Edward / Bernstein, Lynne E. (2006): "Optical cues to the visual perception of lexical and phrasal stress in English", in SP-2006, paper 059.