Auditory-Visual Speech Processing (AVSP) 2011

Volterra, Italy
September 1-2, 2011

Speech-Driven Lip Motion Generation for Tele-Operated Humanoid Robots

Carlos T. Ishi (1), Chaoran Liu (1), Hiroshi Ishiguro (2), Norihiro Hagita (1)

(1) ATR Intelligent Robotics and Communication Labs.; (2) ATR Social Media Research Laboratory Group Hiroshi Ishiguro Laboratory; Kyoto, Japan

To tele-operate the lip motion of a humanoid robot (such as an android) from the operator's utterances, we developed a speech-driven lip motion generation method. The proposed method rotates the vowel space, defined by the first and second formants, around the center vowel and maps the result to lip opening degrees. The method requires calibrating only one parameter for speaker normalization, so no other model training is required. In a pilot experiment, the proposed audio-based method was perceived as more natural than vision-based approaches, regardless of the language.
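
As a rough illustration of the idea of rotating the formant plane around the center vowel and reading off a lip opening degree, a minimal sketch in Python might look like the following. It is not the authors' implementation. The abstract does not state the center-vowel formants, the rotation angle, the output range, or which single parameter is calibrated, so every constant below and the `speaker_scale` argument are hypothetical placeholders chosen only to make the example run.

```python
import math

# All constants are illustrative assumptions; the abstract gives no values.
CENTER_F1_HZ = 500.0      # assumed F1 of the center vowel
CENTER_F2_HZ = 1500.0     # assumed F2 of the center vowel
ROTATION_DEG = 25.0       # assumed rotation angle of the vowel space
OPENING_RANGE_HZ = 400.0  # assumed half-range mapped onto [0, 1]


def lip_opening(f1_hz, f2_hz, speaker_scale=1.0):
    """Map one (F1, F2) frame to a lip opening degree in [0, 1].

    speaker_scale is a hypothetical stand-in for the single
    speaker-normalization parameter mentioned in the abstract.
    """
    # Shift the formant point so the (scaled) center vowel is the origin.
    d1 = f1_hz - CENTER_F1_HZ * speaker_scale
    d2 = f2_hz - CENTER_F2_HZ * speaker_scale

    # Rotate the shifted point and keep the first rotated coordinate.
    theta = math.radians(ROTATION_DEG)
    rotated = d1 * math.cos(theta) - d2 * math.sin(theta)

    # Linearly map the rotated coordinate to [0, 1], with the center
    # vowel at 0.5, then clip to the valid actuator range.
    opening = 0.5 + rotated / (2.0 * OPENING_RANGE_HZ * speaker_scale)
    return min(1.0, max(0.0, opening))
```

Under these assumed constants, a frame with high F1 and low F2 (an open vowel such as /a/) yields a larger opening than a frame with low F1 and high F2 (a close vowel such as /i/), and the center vowel itself maps to the midpoint of the range. In a real system the formants would come from a per-frame formant tracker, which is outside the scope of this sketch.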

Index Terms. lip motion, formant, humanoid robot, teleoperation, synchronization

Bibliographic reference. Ishi, Carlos T. / Liu, Chaoran / Ishiguro, Hiroshi / Hagita, Norihiro (2011): "Speech-driven lip motion generation for tele-operated humanoid robots", in AVSP-2011, 131-135.