EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Lip-Reading from Parametric Lip Contours for Audio-Visual Speech Recognition

Sabri Gurbuz, Eric K. Patterson, Zekeriya Tufekci, John N. Gowdy

Clemson University, USA

This paper describes the incorporation of a visual lip-tracking and lip-reading algorithm that utilizes affine-invariant Fourier descriptors computed from parametric lip contours to improve audio-visual speech recognition. The audio-visual speech recognition system presented here uses parallel hidden Markov models (HMMs), whose audio and visual scores are combined after processing by an optimal decision rule. This work describes the extraction of affine-invariant Fourier descriptors (AI-FDs) from parametric lip-contour data. Finally, it validates the use of optimal weight selection, based on the noise type and signal-to-noise ratio (SNR), for joint audio-visual automatic speech recognition (JAV-ASR).
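The two core ideas of the abstract can be illustrated with a short sketch. The first function computes classical similarity-invariant Fourier descriptors of a closed contour (the paper's AI-FDs extend this normalization to full affine invariance, which is not reproduced here); the second shows the general form of SNR-dependent weighted fusion of audio and visual log-likelihoods. All function names and the weighting scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fourier_descriptors(contour, num_coeffs=10):
    """Similarity-invariant Fourier descriptors of a closed contour.

    contour: (N, 2) array of (x, y) lip-contour points, ordered along
    the boundary. Returns num_coeffs coefficient magnitudes that are
    invariant to translation, scale, rotation, and starting point.
    NOTE: the paper uses *affine*-invariant descriptors, which require
    an additional normalization step not sketched here.
    """
    z = contour[:, 0] + 1j * contour[:, 1]  # boundary as a complex signal
    Z = np.fft.fft(z)
    Z[0] = 0.0                  # drop DC term -> translation invariance
    mags = np.abs(Z)            # magnitudes -> rotation/start-point invariance
    mags = mags / mags[1]       # normalize by first harmonic -> scale invariance
    return mags[1:1 + num_coeffs]

def fuse_scores(log_p_audio, log_p_visual, weight):
    """Weighted log-likelihood fusion of the two HMM streams.

    weight in [0, 1] would be chosen from the estimated noise type and
    SNR (higher audio weight in clean conditions); the selection rule
    itself is the paper's contribution and is not reproduced here.
    """
    return weight * log_p_audio + (1.0 - weight) * log_p_visual

# toy example: descriptors of an ellipse approximating an open mouth
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
ellipse = np.stack([3.0 * np.cos(t), np.sin(t)], axis=1)
fd = fourier_descriptors(ellipse)
```

Because of the normalization, the same descriptor vector is obtained if the contour is shifted, rotated, or uniformly rescaled, which is what makes such features attractive for lip shapes captured at varying camera distances.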


Bibliographic reference.  Gurbuz, Sabri / Patterson, Eric K. / Tufekci, Zekeriya / Gowdy, John N. (2001): "Lip-reading from parametric lip contours for audio-visual speech recognition", In EUROSPEECH-2001, 1181-1184.