EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Feature Extraction from Time-Frequency Matrices for Robust Speech Recognition

Jose C. Segura (1), M. Carmen Benitez (2), Angel de la Torre (1), Antonio J. Rubio (1)

(1) Universidad de Granada, Spain
(2) Universidad de Granada, Spain / International Computer Science Institute, USA

In this paper we present a study of the time-frequency distribution of acoustic-phonetic information for the Spanish language. The study is based on a large, automatically labeled Spanish database, and we conclude that the results are similar to those obtained for hand-labeled English databases. We use bidimensional linear discriminant analysis (LDA) to extract discriminant features in the time-frequency (TF) domain that are more robust to noise than the standard features based on MFCC and time derivatives. We show that the TF domain and its corresponding transformed domain (CTM) are equivalent from the point of view of LDA, and we use this fact to reduce the dimensionality of the problem. Finally, cascade unidimensional LDA (CLDA) is applied, first in frequency and then in time, which gives better estimates of the projection vectors and better recognition performance. The proposed techniques are evaluated on a connected digit recognition task in which the utterances have been artificially corrupted with additive real noises.
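
The cascade idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how each stage is trained, so the choices here (frequency LDA estimated on the center frame of each TF matrix, one time LDA per frequency-projected component, a small ridge term on the within-class scatter, and all function and variable names) are assumptions made for illustration. The point it shows is that the cascade only requires scatter matrices of size F x F and T x T, rather than the (F*T) x (F*T) matrices of a joint bidimensional LDA.

```python
import numpy as np

def lda_projection(X, y, n_components):
    """Fisher LDA: return a (dim, n_components) projection matrix.

    X is (n_samples, dim), y holds integer class labels.
    """
    dim = X.shape[1]
    mean = X.mean(axis=0)
    Sw = np.zeros((dim, dim))  # within-class scatter
    Sb = np.zeros((dim, dim))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - mean)[:, None]
        Sb += len(Xc) * (d @ d.T)
    # Assumption: small ridge term so Sw is safely invertible.
    Sw += 1e-6 * np.trace(Sw) / dim * np.eye(dim)
    # Whiten Sw, then diagonalize Sb in the whitened space.
    w, V = np.linalg.eigh(Sw)
    whiten = V / np.sqrt(w)
    evals, U = np.linalg.eigh(whiten.T @ Sb @ whiten)
    order = np.argsort(evals)[::-1][:n_components]
    return whiten @ U[:, order]

def cascade_lda(tf, y, n_freq, n_time):
    """Cascade LDA on TF matrices of shape (n_samples, n_bands, n_frames)."""
    n_frames = tf.shape[2]
    # Stage 1 (frequency): LDA on the spectral vector of the center frame.
    Wf = lda_projection(tf[:, :, n_frames // 2], y, n_freq)
    # Apply the frequency projection to every frame: (N, n_freq, T).
    Z = np.einsum("nft,fk->nkt", tf, Wf)
    # Stage 2 (time): one LDA per frequency-projected trajectory.
    Wt = [lda_projection(Z[:, k, :], y, n_time) for k in range(n_freq)]
    feats = np.concatenate(
        [Z[:, k, :] @ Wt[k] for k in range(n_freq)], axis=1
    )
    return feats, Wf, Wt

# Synthetic stand-in data (hypothetical): 3 classes, 8 bands, 9 frames.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=300)
templates = rng.normal(size=(3, 8, 9))
tf = templates[labels] + 0.5 * rng.normal(size=(300, 8, 9))

feats, Wf, Wt = cascade_lda(tf, labels, n_freq=2, n_time=2)
```

With these toy dimensions, each sample's 8 x 9 TF matrix is reduced to a 4-dimensional feature vector (2 frequency projections times 2 time projections).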


Bibliographic reference. Segura, Jose C. / Benitez, M. Carmen / Torre, Angel de la / Rubio, Antonio J. (2001): "Feature extraction from time-frequency matrices for robust speech recognition", In EUROSPEECH-2001, 1625-1628.