INTERSPEECH 2006 - ICSLP
In this paper we investigate the use of articulatory data for speech recognition. Recordings of the articulatory movements originate from the MOCHA corpus, a database which contains speech, electroglottograph (EGG), electromagnetic articulograph (EMA) and electropalatograph (EPG) recordings. It was found that, in a hidden Markov model (HMM) based recognition framework, careful processing of these signals can yield significantly better performance than that obtained by decoding the acoustic signals. We present detailed results on the processing of the signals and the associated performance of monophone and triphone systems. Experimental evidence shows that acoustic-signal-to-word mappings and articulatory-signal-to-word mappings are equally complex. However, for the latter, shortcomings of standard HMM based modelling become evident and should be addressed in future systems.
Bibliographic reference. Uraga, Esmeralda / Hain, Thomas (2006): "Automatic speech recognition experiments with articulatory data", In INTERSPEECH-2006, paper 1725-Mon2BuP.3.