4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

The Effect of Visual Information on Word Initial Consonant Perception of Dysarthric Speech

Richard P. Schumeyer, Kenneth E. Barner

Applied Science and Engineering Laboratories, University of Delaware/A.I. duPont Institute, Wilmington, DE, USA

Disabled individuals will realize many benefits from automatic speech recognition. To date, most automatic speech recognition research has focused on normal speech. However, many individuals with physical disabilities also exhibit speech disorders. While limited research has been conducted on dysarthric speech recognition, the preliminary results indicate that additional study is necessary. Recently, increasing attention has been given to multimodal speech recognition schemes that utilize multiple input sources, most commonly audio and video. This multimodal approach has been applied to normal speech with demonstrated effectiveness. By studying the effect of audio and visual information in a human perception experiment, this study attempts to discover whether such an approach would be useful for dysarthric speech recognition. Results of a closed-vocabulary perception test are presented. In this test, 15 normal-hearing viewers were presented with videotapes of three dysarthric speakers speaking a series of one-syllable nonsense words. These words differed only in the initial consonant. The words were presented in both audio-only and audio-visual modes, and perception rates in both modes were measured. The results are analyzed and compared to other studies of visual speech perception and dysarthric speech articulation.

Bibliographic reference. Schumeyer, Richard P. / Barner, Kenneth E. (1996): "The effect of visual information on word initial consonant perception of dysarthric speech", in ICSLP-1996, 46-49.