International Conference on Auditory-Visual Speech Processing 2008

Tangalooma Wild Dolphin Resort, Moreton Island, Queensland, Australia
September 26-29, 2008

Text-to-AV Synthesis System for Thinking Head Project

Takaaki Kuratate

MARCS Auditory Laboratories, University of Western Sydney, Sydney, Australia

We introduce a new text-to-AV (speech and face animation) synthesis system, created for our Thinking Head project, that provides a modular research platform for the AV community. The system includes a novel phone-to-face motion module capable of synthesizing face animation from triphone data. By combining phoneme timing information from human speech with information derived from our speech face motion database, which is built from motion capture data, we build correspondences between diphones and triphones on the one hand and face motion on the other. A comparison between face motion synthesized from speech using only our system and face motion generated from motion capture during speech verifies that the system can synthesize AV speech motion of a quality equivalent to that of motion-capture-driven speech face motion.
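
The abstract does not describe how the phone-to-face motion module is implemented, so the following is only a minimal sketch of one way a triphone lookup with diphone fallback could be combined with phoneme timing. All names here (`synthesize`, `resample`, `triphone_db`, `diphone_db`), the single scalar face-motion parameter, the `(left, centre)` diphone key, the `"sil"` padding label, and the linear time-warping are assumptions made for illustration, not the method of the paper.

```python
from typing import Dict, List, Tuple

Phone = Tuple[str, float, float]   # (label, start_sec, end_sec), e.g. from forced alignment
Trajectory = List[float]           # one face-motion parameter over a phone (hypothetical format)


def resample(traj: Trajectory, n: int) -> Trajectory:
    """Linearly resample a trajectory to exactly n frames."""
    if n <= 0:
        return []
    if len(traj) == 1 or n == 1:
        return [traj[0]] * n
    out = []
    for i in range(n):
        pos = i * (len(traj) - 1) / (n - 1)   # fractional index into traj
        lo = int(pos)
        hi = min(lo + 1, len(traj) - 1)
        frac = pos - lo
        out.append(traj[lo] * (1 - frac) + traj[hi] * frac)
    return out


def synthesize(
    phones: List[Phone],
    triphone_db: Dict[Tuple[str, str, str], Trajectory],
    diphone_db: Dict[Tuple[str, str], Trajectory],
    fps: float = 60.0,
) -> Trajectory:
    """Concatenate stored motion units, time-warped to each phone's duration."""
    # Pad with silence so the first and last phones also have a context.
    labels = ["sil"] + [p[0] for p in phones] + ["sil"]
    frames: Trajectory = []
    for i, (label, start, end) in enumerate(phones):
        left, right = labels[i], labels[i + 2]
        unit = triphone_db.get((left, label, right))   # prefer full triphone context
        if unit is None:
            unit = diphone_db.get((left, label))       # fall back to diphone context
        if unit is None:
            unit = [0.0]                               # assumed neutral pose
        n_frames = max(1, round((end - start) * fps))
        frames.extend(resample(unit, n_frames))
    return frames
```

In this sketch the phoneme timing only determines how many frames each stored unit is stretched or compressed to; a practical system would also need to smooth the joins between consecutive units and handle many face-motion parameters at once.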


Bibliographic reference. Kuratate, Takaaki (2008): "Text-to-AV synthesis system for Thinking Head Project", In AVSP-2008, 191-194.