Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Recent Progress on the Discriminative Region-Dependent Transform for Speech Feature Extraction

Bing Zhang, Spyros Matsoukas, Richard Schwartz

BBN Technologies, USA

The region-dependent transform (RDT) is a feature extraction method for speech recognition that uses the Minimum Phoneme Error (MPE) criterion to optimize a set of feature transforms, each focused on one region of the acoustic space. Previous results showed that RDT significantly reduces recognition error in a large-vocabulary speaker-independent (SI) system. As a follow-up, this paper reports recent progress in applying RDT to speaker-adaptive training (SAT). As in the earlier SI results, integrating RDT with SAT yields a 7% relative improvement in word error rate (WER). The paper also compares RDT theoretically with other discriminative feature extraction methods, including the improved version of feature-space MPE (fMPE) that uses "mean-offsets" as additional input features.
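
The abstract gives no equations, so the following is only a minimal sketch of what "a set of feature transforms, each concentrating on a region of the acoustic space" could look like in practice. It assumes that the regions are modeled by diagonal-covariance Gaussians and that the output feature is the posterior-weighted sum of per-region affine projections; both are illustrative assumptions rather than details confirmed here. The names `region_posteriors` and `apply_rdt` are hypothetical, and the MPE training of the transforms is not shown.

```python
import math


def region_posteriors(x, means, variances, weights):
    """Posterior probability of each region given feature vector x.

    Assumption: each region is a diagonal-covariance Gaussian with a prior weight.
    """
    log_scores = []
    for mean, var, w in zip(means, variances, weights):
        ll = math.log(w)
        for xd, md, vd in zip(x, mean, var):
            ll += -0.5 * (math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd)
        log_scores.append(ll)
    # Normalize in the log domain for numerical stability.
    top = max(log_scores)
    exp_scores = [math.exp(s - top) for s in log_scores]
    total = sum(exp_scores)
    return [e / total for e in exp_scores]


def apply_rdt(x, posteriors, transforms, biases):
    """Combine per-region affine projections, weighted by region posteriors.

    Assumption: y = sum_i gamma_i * (A_i x + b_i), where the A_i and b_i would be
    trained discriminatively (for example under MPE); here they are just inputs.
    """
    out_dim = len(biases[0])
    y = [0.0] * out_dim
    for gamma, A, b in zip(posteriors, transforms, biases):
        for r in range(out_dim):
            projected = sum(A[r][c] * x[c] for c in range(len(x))) + b[r]
            y[r] += gamma * projected
    return y
```

In this sketch, a frame that lies close to one region's mean is transformed almost entirely by that region's projection, while frames between regions receive a blend of the neighboring projections.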


Bibliographic reference. Zhang, Bing / Matsoukas, Spyros / Schwartz, Richard (2006): "Recent progress on the discriminative region-dependent transform for speech feature extraction", in INTERSPEECH-2006, paper 1573-Wed1A2O.5.