Auditory-Visual Speech Processing (AVSP) 2011

Volterra, Italy
September 1-2, 2011

Improving Naturalness of Visual Speech Synthesis

László Czap, János Mátyás

Department of Automation and Communication Technology, University of Miskolc, Miskolc, Hungary

Facial animation has progressed significantly over the past few years, and a variety of algorithms and techniques now make it possible to create highly realistic characters. Based on the authors' visual feature database for speechreading and on developments in 3D modelling, a Hungarian talking head has been created. Our general approach is to use both static and dynamic observations of natural speech to guide the facial animation. A three-level dominance model that takes co-articulation into account has been introduced: each articulation feature is assigned to one of three classes, dominant, flexible, or uncertain. The evaluation process was based on analysis of the standard deviation and the trajectories of the features. The acoustic speech and the articulation are linked to each other by a synchronisation process. Natural head movements, eyebrow raising, blinking, and the expression of emotions are demonstrated.
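The abstract does not give the details of the three-level dominance model, but dominance-based co-articulation is commonly handled in the style of Cohen and Massaro: each segment exerts an exponentially decaying influence on every articulation feature, and the rendered feature value is a dominance-weighted blend of the segment targets. The sketch below illustrates that general idea; the class names, dominance strengths, and decay rate are illustrative assumptions, not values from the paper.

```python
import math

# Assumed dominance strength per feature class, mirroring the paper's
# three-level grouping (illustrative values only).
DOMINANCE = {"dominant": 1.0, "flexible": 0.5, "uncertain": 0.1}

def dominance(t, center, strength, rate=6.0):
    """Exponentially decaying dominance of a segment centred at `center`
    (Cohen-Massaro-style dominance function; `rate` is an assumed decay)."""
    return strength * math.exp(-rate * abs(t - center))

def blend(t, segments):
    """Dominance-weighted blend of feature targets at time t.

    segments: list of (target_value, center_time, class_name) tuples,
    one per phonetic segment influencing this articulation feature.
    """
    num = den = 0.0
    for target, center, cls in segments:
        w = dominance(t, center, DOMINANCE[cls])
        num += w * target
        den += w
    return num / den if den else 0.0
```

For example, a "dominant" segment with target 1.0 pulls the blended trajectory close to its target at its own centre time, while an adjacent "uncertain" segment barely perturbs it, which is the intended effect of grouping features by how strictly they must be realised.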

Index Terms. AV speech synthesis, improving naturalness, expressing emotions

Full Paper
Video Disgust (mpg)
Video Happiness (mpg)
Video Sadness (mpg)
Video Surprise (mpg)

Bibliographic reference. Czap, László / Mátyás, János (2011): "Improving naturalness of visual speech synthesis", In AVSP-2011, 69.