INTERSPEECH 2012

The aim of this study was to characterise, model and compare the lingual articulatory strategies of a group of speakers. Individual principal component analysis (PCA) and multilinear decomposition methods were applied to different representations of the tongue contour extracted from magnetic resonance images (MRI). The corpus consisted of seven speakers articulating 63 French vowels and consonants. Averaged over the seven speakers, the individual PCA with 4 components gave a root mean square prediction error (RMSE) of 0.12 cm and explained 92.6% of the variance. Several multilinear decomposition methods, which model the tongue contour with a single set of components shared across speakers, were then performed and compared. Of these, 2LevelPCA gave the best results. A Student's t-test at the 5% significance level showed that 2LevelPCA matches the performance of the individual PCA when 11 components are used, explaining 91% of the variance with an RMSE of 0.11 cm. With 4 components, the same method explains 75% of the variance with an RMSE of 0.19 cm.
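The two figures of merit quoted above, RMSE and percentage of variance explained, can be illustrated with a minimal sketch of a single-speaker PCA reconstruction. Everything here is an assumption made for illustration: the data are synthetic, the matrix layout (63 articulations by 80 contour coordinates) and the function name `pca_reconstruction_stats` are hypothetical, and the RMSE is taken over all coordinates, which may differ from the exact definition used in the paper.

```python
import numpy as np


def pca_reconstruction_stats(X, n_components):
    """Return (rmse, variance_explained) for a PCA reconstruction of X.

    X has one row per articulation and one column per contour coordinate.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                                   # centre the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # component scores
    X_hat = scores @ Vt[:n_components] + mean        # reconstruction
    rmse = np.sqrt(np.mean((X - X_hat) ** 2))
    variance_explained = np.sum(s[:n_components] ** 2) / np.sum(s ** 2)
    return rmse, variance_explained


# Synthetic stand-in for one speaker (assumed sizes, not the paper's data):
# 63 articulations, 80 contour coordinates, 4 underlying factors plus noise.
rng = np.random.default_rng(0)
n_articulations, n_coords, n_latent = 63, 80, 4
latent = rng.normal(size=(n_articulations, n_latent))
loadings = rng.normal(size=(n_latent, n_coords))
X = latent @ loadings + 0.1 * rng.normal(size=(n_articulations, n_coords))

rmse, variance_explained = pca_reconstruction_stats(X, n_components=4)
print(f"RMSE: {rmse:.3f}  variance explained: {100 * variance_explained:.1f}%")
```

Repeating this per speaker and averaging the two statistics would give speaker-averaged values of the kind reported for the individual PCA; the multilinear methods differ in that one set of components is shared by all speakers.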
Index Terms: Articulatory modelling, speaker normalisation, factor analysis, MRI
Bibliographic reference. Vargas, Julián Andrés Valdés / Badin, Pierre / Lamalle, Laurent (2012): "Articulatory speaker normalisation based on MRI-data using three-way linear decomposition methods", In INTERSPEECH-2012, 2186-2189.