Auditory-Visual Speech Processing 2007 (AVSP2007)
Kasteel Groenendaal, Hilvarenbeek, The Netherlands
Previous studies have revealed a temporal window within which human observers perceive physically desynchronized auditory and visual signals as synchronous. This study investigated the effects of intermodal timing differences and speed differences on the intelligibility of auditory-visual speech. We used 20 minimal pairs of Japanese four-mora words, such as "mi-zu-a-ge" (catch landing) versus "mi-zu-a-me" (starch syrup), and administered intelligibility tests. Words were presented under visual-only, auditory-only, and auditory-visual (AV) conditions. Two types of AV conditions were used: asynchronous and expansion conditions. In the asynchronous (i.e., timing difference) conditions, the audio lag was 0-400 ms. In the expansion (i.e., speed difference) conditions, the auditory signal was time-expanded while the visual signal was kept at its original speed; the amount of expansion was 0-400 ms. Results showed that word intelligibility declined as the timing difference and the speed difference increased. Analysis of the AV benefit (i.e., the superiority of AV performance over auditory-only performance) revealed that the AV benefit at the end of words declined as the speed difference increased, although it did not decline as the timing difference increased. These results suggest that intermodal lag recalibration requires a constant timing difference between auditory and visual signals. Older adults recalibrated neither the timing difference nor the speed difference. These results might be useful for the design of a multimodal speech-rate conversion system.
Bibliographic reference. Tanaka, Akihiro / Sakamoto, Shuichi / Tsumura, Komi / Suzuki, Yôiti (2007): "Effects of intermodal timing difference and speed difference on intelligibility of auditory-visual speech in younger and older adults", In AVSP-2007, paper P39.