Auditory-Visual Speech Processing (AVSP) 2009

University of East Anglia, Norwich, UK
September 10-13, 2009

An Image-based Talking Head System

Kang Liu, Joern Ostermann

Institut für Informationsverarbeitung, Leibniz Universität Hannover, Germany

This paper presents an image-based talking head system consisting of two parts: analysis and synthesis. The analysis part builds a database containing a large number of mouth images and their associated facial and speech features. The synthesis part generates realistic facial animations from phonetic transcripts of text. Each animation is produced by selecting and concatenating mouth images from the database that match the words spoken by the talking head. Subjective tests show that 60% of the animations are indistinguishable from real recordings.

Index Terms: talking head, unit selection, evaluation
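
To illustrate the idea of selecting and concatenating mouth images, the following is a minimal sketch of unit selection by dynamic programming. It is not the system's actual implementation: the names MouthUnit, mouth_opening, target_cost, concat_cost, select_units and w_concat are hypothetical, and both cost functions are simple placeholders assumed for illustration. The sketch picks one database image per phoneme so that the sum of a target cost (does the image match the phoneme?) and a concatenation cost (do consecutive images join smoothly?) is minimal.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class MouthUnit:
    """One database entry (illustrative fields, assumed for this sketch)."""
    image_id: int
    phoneme: str          # phoneme spoken when the image was recorded
    mouth_opening: float  # hypothetical scalar facial feature


def target_cost(unit: MouthUnit, phoneme: str) -> float:
    """Placeholder target cost: 0 if the phoneme labels match, else 1."""
    return 0.0 if unit.phoneme == phoneme else 1.0


def concat_cost(prev: MouthUnit, cur: MouthUnit) -> float:
    """Placeholder concatenation cost: jump in the facial feature."""
    return abs(prev.mouth_opening - cur.mouth_opening)


def select_units(phonemes: Sequence[str],
                 database: Sequence[MouthUnit],
                 w_concat: float = 1.0) -> List[MouthUnit]:
    """Return one unit per phoneme minimizing the total cost (Viterbi search)."""
    if not phonemes or not database:
        return []
    n = len(database)
    # cost[j]: cheapest path cost ending in database[j] at the current step
    cost = [target_cost(u, phonemes[0]) for u in database]
    back: List[List[int]] = []  # back[t][j]: best predecessor of j at step t+1
    for ph in phonemes[1:]:
        new_cost: List[float] = []
        pointers: List[int] = []
        for cur in database:
            # Cheapest way to reach `cur` from any unit of the previous step
            best = min(
                range(n),
                key=lambda j: cost[j] + w_concat * concat_cost(database[j], cur),
            )
            transition = w_concat * concat_cost(database[best], cur)
            new_cost.append(cost[best] + transition + target_cost(cur, ph))
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest final unit to recover the whole sequence
    idx = min(range(n), key=lambda j: cost[j])
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [database[i] for i in path]
```

Calling select_units with a phoneme sequence and a list of MouthUnit entries returns the chosen images in playback order. The search is quadratic in the database size per phoneme, so a practical system would presumably restrict the candidates for each phoneme before searching.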

