INTERSPEECH 2006 - ICSLP
The region-dependent transform (RDT) is a feature extraction method for speech recognition that uses the Minimum Phoneme Error (MPE) criterion to optimize a set of feature transforms, each focused on one region of the acoustic space. Previous results showed that RDT significantly reduces recognition error in a large-vocabulary speaker-independent (SI) system. As a follow-up, this paper reports recent progress in applying RDT to speaker-adaptive training (SAT). As in the earlier SI experiments, integrating RDT with SAT yields a 7% relative improvement in word error rate (WER). The paper also compares RDT theoretically with other discriminative feature extraction methods, including the improved version of feature-space MPE (fMPE) that uses "mean-offsets" as additional input features.
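To make the idea of "a set of feature transforms, each concentrating on a region of the acoustic space" concrete, the following is a minimal sketch, not the authors' implementation. It assumes that regions are modeled by diagonal-covariance Gaussians and that the output feature is the posterior-weighted sum of per-region affine transforms. The function names (region_posteriors, apply_rdt) and all parameters are hypothetical, and the MPE-based training of the transforms is not shown.

```python
import math

def region_posteriors(x, means, variances, weights):
    """Posterior probability of each region given frame x.

    Assumption: each region is a diagonal-covariance Gaussian with a
    prior weight; this division of the acoustic space is illustrative.
    """
    log_scores = []
    for mu, var, w in zip(means, variances, weights):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_scores.append(ll)
    # Normalize in the log domain for numerical stability.
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]

def apply_rdt(x, posteriors, transforms, biases):
    """Posterior-weighted sum of per-region affine transforms A_i x + b_i.

    Assumption: the combination rule is a soft (posterior-weighted) sum;
    the matrices and biases here are placeholders, not trained values.
    """
    out_dim = len(biases[0])
    y = [0.0] * out_dim
    for gamma, A, b in zip(posteriors, transforms, biases):
        for r in range(out_dim):
            proj = sum(A[r][c] * x[c] for c in range(len(x))) + b[r]
            y[r] += gamma * proj
    return y

# Toy example with two regions in a 2-D feature space (made-up values).
means = [[-5.0, -5.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
weights = [0.5, 0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
transforms = [identity, identity]
biases = [[0.0, 0.0], [0.0, 0.0]]

frame = [-4.8, -5.1]
gammas = region_posteriors(frame, means, variances, weights)
features = apply_rdt(frame, gammas, transforms, biases)
```

In this toy setup both regions share the identity transform, so the output equals the input frame; in a trained system each region would have its own transform, and the posteriors would blend them smoothly across region boundaries.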
Bibliographic reference. Zhang, Bing / Matsoukas, Spyros / Schwartz, Richard (2006): "Recent progress on the discriminative region-dependent transform for speech feature extraction", In INTERSPEECH-2006, paper 1573-Wed1A2O.5.