13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Amplitude Modulation Filters as Feature Sets for Robust ASR: Constant Absolute or Relative Bandwidth?

Niko Moritz (1), Jörn Anemüller (1,2), Birger Kollmeier (1,2)

(1) Fraunhofer IDMT, Project group Hearing, Speech and Audio Technology, Oldenburg, Germany
(2) University of Oldenburg, Medical Physics Department, Oldenburg, Germany

Many research efforts in feature extraction for automatic speech recognition focus on analyzing the slow amplitude fluctuations of speech. This study investigates the importance of spectral and temporal resolution in amplitude modulation frequency analysis in order to provide guidance for appropriate filter design. To this end, filters with wavelet-like and Fourier-transform-like time scales are examined, i.e., the importance of time separation is compared with that of frequency separation. The results demonstrate that analyzing three separate amplitude modulation frequency bands of constant bandwidth, covering the range from about 2 to 16 Hz, is sufficient for automatic speech recognition.
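The contrast in the title between constant absolute and constant relative bandwidth can be illustrated with a minimal sketch. It assumes, purely for illustration, that the range from 2 to 16 Hz is split into three contiguous bands; the abstract does not specify the actual filter shapes, centre frequencies, or band edges used in the study, and the function names below are hypothetical.

```python
def constant_bandwidth_edges(f_lo, f_hi, n_bands):
    """Contiguous bands of equal absolute width in Hz (Fourier-transform-like)."""
    width = (f_hi - f_lo) / n_bands
    return [(f_lo + i * width, f_lo + (i + 1) * width) for i in range(n_bands)]


def constant_relative_edges(f_lo, f_hi, n_bands):
    """Contiguous bands with equal upper/lower edge ratio (wavelet-like)."""
    ratio = (f_hi / f_lo) ** (1.0 / n_bands)
    return [(f_lo * ratio ** i, f_lo * ratio ** (i + 1)) for i in range(n_bands)]


if __name__ == "__main__":
    # Assumed illustration values: three bands spanning 2 to 16 Hz.
    for name, fn in [("constant absolute", constant_bandwidth_edges),
                     ("constant relative", constant_relative_edges)]:
        bands = fn(2.0, 16.0, 3)
        print(name, [(round(lo, 2), round(hi, 2)) for lo, hi in bands])
```

Under these assumptions, the constant-absolute scheme gives every band the same width of about 4.67 Hz, whereas the constant-relative scheme doubles the width from band to band, with edges near 2, 4, 8, and 16 Hz.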

Index Terms: amplitude modulation, speech recognition, wavelet transform, feature extraction


Bibliographic reference. Moritz, Niko / Anemüller, Jörn / Kollmeier, Birger (2012): "Amplitude modulation filters as feature sets for robust ASR: constant absolute or relative bandwidth?", in INTERSPEECH-2012, 1231-1234.