Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Study of Time and Frequency Variability in Pathological Speech and Error Reduction Methods for Automatic Speech Recognition

Oscar Saz, Antonio Miguel, Eduardo Lleida, Alfonso Ortega, Luis Buera

Universidad de Zaragoza, Spain

In this work, we study the variations in the time and frequency domains in a Spanish-language corpus of speakers with non-pathological and pathological speech. We show that pathological speech has greater variability in word duration than non-pathological speech, while in the frequency domain vowel confusability increases by 18%. Baseline Automatic Speech Recognition (ASR) experiments with this corpus demonstrate that this variability degrades the performance of ASR systems. To reduce the impact of time and frequency variability, we use a recent Vocal Tract Length Normalization (VTLN) system, MATE (augMented stAte space acousTic modEl), as a way of improving the performance of ASR systems for speakers with any kind of speech pathology. Experiments with MATE show WER reductions of 17.04% and 11.19% using frequency and time MATE, respectively.
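
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate (WER) and a reduction between two systems are commonly computed. It is not the authors' evaluation code. The function names are hypothetical, and treating the reported reductions as relative (rather than absolute) is an assumption, since the abstract does not state which is meant.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Fraction of the baseline error removed (assumed 'relative' reading)."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - improved_wer) / baseline_wer
```

Under the relative reading, a system that lowers a baseline WER of 0.20 to 0.10 would yield `relative_reduction(0.20, 0.10)` of about 0.5, that is, a 50% reduction. The baseline figures needed to apply this to the paper's results are not given in the abstract.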


Bibliographic reference.  Saz, Oscar / Miguel, Antonio / Lleida, Eduardo / Ortega, Alfonso / Buera, Luis (2006): "Study of time and frequency variability in pathological speech and error reduction methods for automatic speech recognition", In INTERSPEECH-2006, paper 1266-Tue1FoP.11.