We test the hypothesis that adding information about the positions of electromagnetic articulograph (EMA) sensors on the lips and jaw can improve the results of a typical acoustic-to-EMA mapping system, based on support vector regression, that targets the tongue sensors. Our initial motivation is to use such a system to add tongue animation to a talking head built by concatenating bimodal acoustic-visual units. For completeness, we also train a system that maps jaw and lip information alone to tongue information.
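The mapping described above can be sketched with scikit-learn: a support vector regression model whose input acoustic features are augmented with lip/jaw sensor coordinates, predicting tongue sensor coordinates. All feature dimensions, sensor counts, and hyperparameters below are illustrative assumptions, not values from the paper, and random data stands in for real EMA recordings.

```python
# Hypothetical sketch of an acoustic-plus-facial-to-tongue mapping via
# support vector regression. Dimensions and data are illustrative only.
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_frames = 200
acoustic = rng.normal(size=(n_frames, 13))   # e.g. per-frame MFCCs (assumed)
lips_jaw = rng.normal(size=(n_frames, 6))    # x/y of three facial EMA sensors
tongue = rng.normal(size=(n_frames, 6))      # x/y of three tongue EMA sensors

# Augment the acoustic features with the lip/jaw sensor positions.
X = np.hstack([acoustic, lips_jaw])

# One epsilon-SVR per output coordinate, with feature standardization.
model = MultiOutputRegressor(
    make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.1))
)
model.fit(X, tongue)
pred = model.predict(X)
print(pred.shape)  # (200, 6)
```

The facial-only baseline mentioned in the abstract would correspond to fitting the same model on `lips_jaw` alone instead of the concatenated `X`.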
Bibliographic reference: Toutios, Asterios / Ouni, Slim (2011): "Predicting tongue positions from acoustics and facial features", in INTERSPEECH-2011, 2661-2664.