4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Multi-Modal Encoding of Speech in Memory: A First Report

David B. Pisoni, Helena M. Saldaña, Sonya M. Sheffert

Speech Research Laboratory, Indiana University, Bloomington, IN, USA

Why do people like to watch videos on TV? Why is there now increased interest in video telephones and multi-media technologies that were first developed back in the 1960s? Obviously, the availability of new digital technology has played an enormous role in this transition. But we also believe it is due, in part, to the same operating principle that encourages listeners in noisy environments to orient toward a talker's face. A multi-modal speech signal is extremely robust and informative, and it provides information that perceivers are able to exploit during perceptual analysis. In this paper, we present results from two experiments that examined performance in immediate-memory and serial-recall tasks with normal-hearing listeners under unimodal (auditory-only) and multi-modal (auditory+visual) presentation. Our findings suggest that adding visual information about the speaker's articulation to the stimulus display affects the efficiency of initial encoding operations at the time of perception and also yields more detailed and robust representations of the stimulus events in memory. These results have implications for current theories of speech perception and spoken language processing.


Bibliographic reference: Pisoni, David B. / Saldaña, Helena M. / Sheffert, Sonya M. (1996): "Multi-modal encoding of speech in memory: a first report", in ICSLP-1996, 1664-1667.