EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology
2nd INTERSPEECH Event

Aalborg, Denmark
September 3-7, 2001

Evaluation of an Automatically Obtained Shape and Appearance Model For Automatic Audio Visual Speech Recognition

Philippe Daubias, Paul Deleglise

LIUM, France

In this paper, we first present a shape and appearance model for Audio-Visual Automatic Speech Recognition. The shape model consists of a template (the mean shape) and a set of deformation vectors that transform it into any possible shape. The global appearance model is a neural network trained to classify 5×5 colour image blocks as skin, lips or inside of the mouth. Both parts of the model were built automatically, without hand-labelling; the appearance model was built by exploiting speech bimodality (acoustic information). We then propose several measures for evaluating the quality of lip location. Finally, we show the classification results obtained using one hand-labelled and two automatically built appearance models of the lips.
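
As an illustration only, the two ingredients named in the abstract can be sketched in a few lines of Python. This is not the authors' implementation: the function names `deform_shape` and `block_features`, the linear combination of deformation vectors, the raw colour values used as features, and the top-left block indexing are all assumptions made for the example. The neural network classifier itself is not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Pixel = Tuple[float, float, float]


def deform_shape(mean_shape: Sequence[Point],
                 deformations: Sequence[Sequence[Point]],
                 weights: Sequence[float]) -> List[Point]:
    """Return the mean shape plus a weighted sum of deformation vectors.

    Assumption: the deformation is linear, one weight per vector.
    """
    if len(deformations) != len(weights):
        raise ValueError("one weight per deformation vector is required")
    shape = [[x, y] for x, y in mean_shape]
    for vector, weight in zip(deformations, weights):
        if len(vector) != len(mean_shape):
            raise ValueError("deformation vector and mean shape differ in length")
        for i, (dx, dy) in enumerate(vector):
            shape[i][0] += weight * dx
            shape[i][1] += weight * dy
    return [(x, y) for x, y in shape]


def block_features(image: Sequence[Sequence[Pixel]],
                   row: int, col: int, size: int = 5) -> List[float]:
    """Flatten the size x size colour block whose top-left pixel is (row, col).

    Assumption: the classifier input is the raw colour components,
    giving size * size * 3 values (75 for a 5x5 block).
    """
    if row < 0 or col < 0 or row + size > len(image) or col + size > len(image[0]):
        raise ValueError("block does not fit inside the image")
    return [float(component)
            for r in range(row, row + size)
            for pixel in image[r][col:col + size]
            for component in pixel]
```

A candidate lip shape would then be generated by choosing weights for `deform_shape`, and each 5×5 block under or around that shape would be passed through `block_features` before classification as skin, lips or inside of the mouth.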

Bibliographic reference.  Daubias, Philippe / Deleglise, Paul (2001): "Evaluation of an automatically obtained shape and appearance model for automatic audio visual speech recognition", In EUROSPEECH-2001, 1031-1034.