Auditory-Visual Speech Processing (AVSP) 2011
To tele-operate the lip motion of a humanoid robot (such as an android) from the operator's utterances, we developed a speech-driven lip motion generation method. The proposed method rotates the vowel space, defined by the first and second formants, around the center vowel and maps the result to lip opening degrees. Speaker normalization requires calibrating only one parameter, so no further model training is needed. In a pilot experiment, the proposed audio-based method was perceived as more natural than vision-based approaches, regardless of the language.
Index Terms. lip motion, formant, humanoid robot, teleoperation, synchronization
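The formant-to-lip mapping summarized in the abstract can be illustrated with a minimal sketch. The abstract gives no numeric values, so everything specific here is an assumption made for illustration: the center-vowel formants, the rotation angle, the scaling range, the function name `lip_opening`, and the choice of a multiplicative `speaker_scale` as the single calibration parameter. It shows the general idea only and is not the authors' implementation.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
CENTER_F1_HZ = 500.0      # assumed F1 of the center (neutral) vowel
CENTER_F2_HZ = 1500.0     # assumed F2 of the center (neutral) vowel
ROTATION_DEG = 25.0       # hypothetical rotation angle of the vowel space
OPENING_RANGE_HZ = 400.0  # hypothetical half-range mapped onto full opening


def lip_opening(f1_hz: float, f2_hz: float, speaker_scale: float = 1.0) -> float:
    """Map one (F1, F2) frame to a lip opening degree in [0, 1].

    speaker_scale stands in for the single speaker-normalization
    parameter; modelling it as a divisor on both formants is an assumption.
    """
    # Speaker normalization with one calibrated parameter.
    f1 = f1_hz / speaker_scale
    f2 = f2_hz / speaker_scale

    # Express the frame relative to the center vowel.
    d1 = f1 - CENTER_F1_HZ
    d2 = f2 - CENTER_F2_HZ

    # Rotate the (F1, F2) plane around the center vowel and keep the
    # coordinate along the rotated axis taken here as "lip opening".
    theta = math.radians(ROTATION_DEG)
    rotated = d1 * math.cos(theta) - d2 * math.sin(theta)

    # Linear map: the center vowel gives 0.5; clamp to the valid range.
    opening = 0.5 + 0.5 * rotated / OPENING_RANGE_HZ
    return min(1.0, max(0.0, opening))


if __name__ == "__main__":
    # Rough textbook-style formant values, used only as sample inputs.
    for label, f1, f2 in [("a", 800.0, 1300.0), ("i", 300.0, 2300.0)]:
        print(label, round(lip_opening(f1, f2), 3))
```

Under these assumed constants, an open vowel such as /a/ (high F1) lands above the midpoint and a close vowel such as /i/ lands below it. Adapting the sketch to a new speaker would mean changing only `speaker_scale`, which mirrors the single-parameter calibration the abstract describes.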
Geminoid: audio, mocap, vision
Telenoid: audio, mocap, vision
Bibliographic reference. Ishi, Carlos T. / Liu, Chaoran / Ishiguro, Hiroshi / Hagita, Norihiro (2011): "Speech-driven lip motion generation for tele-operated humanoid robots", In AVSP-2011, 131-135.