International Conference on Auditory-Visual Speech Processing 2008
Tangalooma Wild Dolphin Resort,
Moreton Island, Queensland, Australia
The present work describes a new method for modeling 3-D motion capture data of speech movements. A Vicon motion capture system tracked 27 markers on the face of a speaker uttering 27 VCVs (vowel-consonant-vowel sequences). The 3-D coordinates of the markers in all frames of the recording were modeled in four ways: 1) a plain PCA; 2) a guided PCA, where each component is determined on a subset of markers that represents an articulator and is then used to reconstruct the data by linear regression; 3) a cubic model, where the components are determined by a PCA and are used to reconstruct the data by a third-order polynomial; and 4) a guided cubic model, where each component is determined by a PCA on a subset of markers and the components are used to reconstruct the data by a third-order polynomial. Results show that the last method, called guided non-linear model estimation (gnoME) with cubic regression, yields meaningful articulatory parameters like the guided PCA, while its explained variance equals that of the plain PCA and its 3-D data reconstruction is more accurate than that of the plain PCA.
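The guided cubic model (method 4) can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the marker groups are hypothetical, the data are synthetic, only the first principal component of each group is used, the scores are scaled to unit variance, and the cubic polynomial has no cross terms between components.

```python
import numpy as np

def first_pc_scores(X):
    """Project the centered rows of X onto its first principal component."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt[0]
    # Scaling to unit variance keeps the cubic terms well conditioned (assumption).
    return scores / scores.std()

def guided_scores(data, marker_groups):
    """One component per marker group; columns are ordered x, y, z per marker."""
    columns = []
    for markers in marker_groups:
        idx = np.concatenate([[3 * m, 3 * m + 1, 3 * m + 2] for m in markers])
        columns.append(first_pc_scores(data[:, idx]))
    return np.column_stack(columns)

def polynomial_design(scores, order):
    """Design matrix [1, s, s^2, ..., s^order] per component, no cross terms."""
    terms = [np.ones((scores.shape[0], 1))]
    for power in range(1, order + 1):
        terms.append(scores ** power)
    return np.hstack(terms)

def reconstruct(data, scores, order):
    """Least-squares reconstruction of all coordinates from the component scores."""
    design = polynomial_design(scores, order)
    coef, *_ = np.linalg.lstsq(design, data, rcond=None)
    return design @ coef

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic stand-in for the recording: 500 frames, 27 markers, 3 coordinates each.
rng = np.random.default_rng(0)
n_frames, n_markers = 500, 27
latent = rng.normal(size=(n_frames, 2))
linear_mix = rng.normal(size=(2, 3 * n_markers))
nonlinear_mix = rng.normal(size=(2, 3 * n_markers))
data = (latent @ linear_mix
        + 0.3 * (latent ** 2) @ nonlinear_mix
        + 0.05 * rng.normal(size=(n_frames, 3 * n_markers)))

# Hypothetical marker subsets standing in for two articulators.
marker_groups = [[0, 1, 2], [3, 4, 5, 6]]

scores = guided_scores(data, marker_groups)
err_linear = rmse(data, reconstruct(data, scores, order=1))  # guided PCA (method 2)
err_cubic = rmse(data, reconstruct(data, scores, order=3))   # guided cubic (method 4)
```

Because the linear design matrix is a subset of the cubic one, the cubic least-squares fit on the same data cannot have a larger residual than the linear fit. How much smaller it is depends on how non-linear the marker trajectories are.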
Bibliographic reference. Fagel, Sascha / Madany, Katja (2008): "Guided non-linear model estimation (gnoME)", In AVSP-2008, 59-62.