International Conference on Auditory-Visual Speech Processing 2008

Tangalooma Wild Dolphin Resort, Moreton Island, Queensland, Australia
September 26-29, 2008

Parameterisation of 3D Speech Lip Movements

James D. Edge, Adrian Hilton, Philip J. B. Jackson

Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK.

In this paper we describe a parameterisation of lip movements which maintains the dynamic structure inherent in the task of producing speech sounds. A stereo capture system is used to reconstruct 3D models of a speaker producing sentences from the TIMIT corpus. These data are mapped into a space which maintains the relationships between samples and their temporal derivatives. By incorporating dynamic information within the parameterisation of lip movements, we can model both the cyclical structure and the causal nature of speech movements, as described by an underlying visual speech manifold. We believe that such a structure will be appropriate to various areas of speech modelling, in particular the synthesis of speech lip movements.

Bibliographic reference. Edge, James D. / Hilton, Adrian / Jackson, Philip J. B. (2008): "Parameterisation of 3D speech lip movements", In AVSP-2008, 229-234.