Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Rule-Based Visual Speech Synthesis

Jonas Beskow

Department of Speech Communication and Music Acoustics, KTH, Stockholm, Sweden

A system for rule-based audiovisual text-to-speech synthesis has been created. The system is based on the KTH text-to-speech system, which has been complemented with a three-dimensional parameterized model of a human face. The face can be animated in real time, synchronized with the auditory speech. The facial model is controlled by the same synthesis software as the auditory speech synthesizer. A set of rules that takes coarticulation into account has been developed. The audiovisual text-to-speech system has also been incorporated into a spoken man-machine dialogue system that is being developed at the department.

Bibliographic reference. Beskow, Jonas (1995): "Rule-based visual speech synthesis", In EUROSPEECH-1995, 299-302.