ISCA Archive Interspeech 2013

Articulatory features for speech-driven head motion synthesis

Atef Ben-Youssef, Hiroshi Shimodaira, David Adam Braude

This study investigates the use of articulatory features for speech-driven head motion synthesis, as opposed to the prosodic features, such as F0 and energy, that have mainly been used in the literature. In the proposed approach, multi-stream HMMs are trained jointly on the synchronous streams of speech and head motion data. Articulatory features can be regarded as an intermediate parametrisation of speech that is expected to have a close link with head movement. Measured head and articulatory movements acquired by EMA were synchronously recorded with speech. The measured articulatory data were compared with those predicted from speech using an HMM-based inversion mapping system trained in a semi-supervised fashion. Canonical correlation analysis (CCA) on a data set of free speech from 12 people shows that the articulatory features are more correlated with head rotation than prosodic and/or cepstral speech features. It is also shown that head motion synthesised using articulatory features gives higher correlations with the original head motion than head motion synthesised using only prosodic features.
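
For readers unfamiliar with CCA, the following is a minimal sketch of how canonical correlations between two feature sets can be computed. It is not the authors' implementation: the function name `canonical_correlations`, the QR-plus-SVD formulation, and all of the data are assumptions made for illustration. The arrays are synthetic stand-ins, built so that one feature set depends more strongly on the "head" signal than the other, so the numbers say nothing about the paper's results.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between X (n x p) and Y (n x q), descending.

    Hypothetical helper: centres both feature sets, orthonormalises their
    columns with QR, and takes the singular values of Qx^T Qy.
    Assumes both matrices have full column rank and n > max(p, q).
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    # Guard against tiny numerical overshoot outside [0, 1].
    return np.clip(s, 0.0, 1.0)


# Synthetic data only (an assumption for illustration, not EMA recordings).
rng = np.random.default_rng(0)
n_frames = 2000
head = rng.standard_normal((n_frames, 3))  # stand-in for 3 head rotation angles

# "Articulatory-like" features: strongly driven by the head signal by construction.
mixing = rng.standard_normal((3, 6))
articulatory_like = head @ mixing + 0.5 * rng.standard_normal((n_frames, 6))

# "Prosody-like" features: only weakly driven by one head dimension by construction.
prosody_like = 0.3 * head[:, :1] + rng.standard_normal((n_frames, 2))

cc_articulatory = canonical_correlations(head, articulatory_like)
cc_prosody = canonical_correlations(head, prosody_like)
print("articulatory-like:", np.round(cc_articulatory, 3))
print("prosody-like:     ", np.round(cc_prosody, 3))
```

The number of canonical correlations returned equals the smaller of the two feature dimensions, so comparing feature sets of different sizes is usually done on the leading correlation or on a summary of all of them.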

doi: 10.21437/Interspeech.2013-632

Cite as: Ben-Youssef, A., Shimodaira, H., Braude, D.A. (2013) Articulatory features for speech-driven head motion synthesis. Proc. Interspeech 2013, 2758-2762, doi: 10.21437/Interspeech.2013-632

  author={Atef Ben-Youssef and Hiroshi Shimodaira and David Adam Braude},
  title={{Articulatory features for speech-driven head motion synthesis}},
  booktitle={Proc. Interspeech 2013},