Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Neural Network Prediction of Lip Shape from Muscle EMG in Japanese Speech

Makoto Hirayama (1), Eric Vatikiotis-Bateson (2), Vincent Gracco (3), Mitsuo Kawato (2)

(1) Hewlett-Packard Laboratories Japan, Kanagawa, Japan
(2) ATR Human Information Processing Research Laboratories, Kyoto, Japan
(3) Haskins Laboratories, New Haven, Connecticut, USA

Lip movements during utterances of short Japanese sentences were predicted from orofacial muscle activity (EMG) using artificial neural network models. The network inputs were EMG signals from six orofacial muscles. The outputs were two lip movement parameters: the horizontal distance between the corners of the mouth, and the distance between the midsagittal lower lip and jaw markers. In addition to the relation between muscle EMG and articulator motion, the network learned the shape of a time-delay filter. A comparison of position, velocity, and acceleration prediction networks showed that the position prediction network performed best at recovering lip movement.
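
As a rough illustration of the kind of mapping described above, the following sketch feeds a short window of delayed EMG frames from six channels into a small feedforward network with two outputs. Only the six inputs and two outputs come from the abstract. The number of delay taps, the hidden-layer size, the tanh activation, the random untrained weights, and the synthetic EMG data are all assumptions made for illustration, and the function names are hypothetical. This is not the authors' architecture or training procedure.

```python
import math
import random

N_MUSCLES = 6   # EMG channels (stated in the abstract)
N_OUTPUTS = 2   # lip movement parameters (stated in the abstract)
N_TAPS = 5      # ASSUMED number of delayed frames fed to the network
N_HIDDEN = 8    # ASSUMED hidden-layer size


def delay_windows(emg, n_taps):
    """Concatenate the current frame and the n_taps - 1 previous frames
    into one input vector per time step (most recent frame first)."""
    windows = []
    for t in range(n_taps - 1, len(emg)):
        frame = []
        for k in range(n_taps):
            frame.extend(emg[t - k])
        windows.append(frame)
    return windows


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply one fully connected layer, optionally followed by an activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def predict(emg, hidden, output, n_taps=N_TAPS):
    """Map each delayed-EMG window to two lip parameters (untrained weights)."""
    return [dense(dense(x, hidden, math.tanh), output)
            for x in delay_windows(emg, n_taps)]


rng = random.Random(0)
hidden = init_layer(N_MUSCLES * N_TAPS, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUTS, rng)

# Synthetic stand-in for processed EMG: 20 frames x 6 channels (not real data).
emg = [[rng.random() for _ in range(N_MUSCLES)] for _ in range(20)]
lip = predict(emg, hidden, output)
print(len(lip), len(lip[0]))
```

With 20 input frames and 5 taps, the sketch yields 16 predictions of 2 values each. Because the weights on the delayed frames are free parameters, training such a network would let it shape its own temporal filter over the EMG history, which is one plausible reading of the time-delay filter mentioned above.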

