Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Automatic Speech Recognition Experiments with Articulatory Data

Esmeralda Uraga, Thomas Hain

University of Sheffield, UK

In this paper we investigate the use of articulatory data for speech recognition. The recordings of articulatory movements come from the MOCHA corpus, a database that contains speech, EGG, EMA and EPG recordings. We found that, in a hidden Markov model (HMM) based recognition framework, careful processing of these signals can yield significantly better performance than decoding the acoustic signals. We present detailed results on the processing of the signals and the associated performance of monophone and triphone systems. Experimental evidence shows that acoustic-signal-to-word mappings and articulatory-signal-to-word mappings are equally complex. For the latter, however, shortcomings of standard HMM-based modelling are evident and should be addressed in future systems.


Bibliographic reference. Uraga, Esmeralda / Hain, Thomas (2006): "Automatic speech recognition experiments with articulatory data", in INTERSPEECH-2006, paper 1725-Mon2BuP.3.