Odyssey 2012 - The Speaker and Language Recognition Workshop

June 25-28, 2012

Exemplar-based Sparse Representation and Sparse Discrimination for Noise Robust Speaker Identification

Rahim Saeidi (1), Antti Hurmalainen (2), Tuomas Virtanen (2), David A. van Leeuwen (1)

(1) Centre for Language and Speech Technology, Radboud University Nijmegen, The Netherlands
(2) Department of Signal Processing, Tampere University of Technology, Tampere, Finland

Probabilistic modeling is the most successful and widely used approach in speaker recognition, either for modeling speakers in the GMM-UBM structure or as a prior in secondary-level feature extraction to form i-vectors. In this paper, we introduce exemplar-based sparse representation and sparse discrimination for closed-set speaker identification in a noisy living room from very short speech segments, each about 2 seconds long on average. Large spectro-temporal contexts in the mel-frequency band energy domain are used to build a dictionary of all speakers. The observed noisy speech is decomposed over this dictionary, and the resulting sparse activations are extracted as features for the modeling stage. Sparse discriminant analysis is employed to learn sparse discriminative directions for the classification stage. Experiments on the recently developed Computational Hearing in Multisource Environments (CHiME) corpus demonstrate excellent performance of the proposed approach, especially at low SNR. Speaker identification results are also reported for baseline text-independent GMM-UBM and text-dependent HMM systems.
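
The decomposition step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the dictionary is random toy data standing in for stacked mel-band-energy windows, the solver is a generic multiplicative-update rule that minimises KL divergence with an L1 penalty, and the names `sparse_activations`, `sparsity` and `labels` are hypothetical. The final decision simply sums activations per speaker, which is a simpler stand-in for the sparse discriminant analysis stage the abstract names.

```python
import numpy as np

def sparse_activations(y, D, sparsity=0.1, n_iter=500, eps=1e-12):
    """Approximate y by D @ x with x >= 0 and sparse.

    Assumed solver: multiplicative updates for KL divergence plus an
    L1 penalty on x. The abstract does not specify the solver.
    """
    x = np.ones(D.shape[1])
    col_sums = D.sum(axis=0)
    for _ in range(n_iter):
        ratio = y / (D @ x + eps)
        # The update keeps x non-negative; `sparsity` shrinks activations.
        x *= (D.T @ ratio) / (col_sums + sparsity + eps)
    return x

rng = np.random.default_rng(0)
n_features, atoms_per_speaker, n_speakers = 40, 5, 3

# Toy dictionary: each column is one exemplar (hypothetical random data).
D = rng.random((n_features, atoms_per_speaker * n_speakers))
# Speaker label of each exemplar: atoms 0-4 -> 0, 5-9 -> 1, 10-14 -> 2.
labels = np.repeat(np.arange(n_speakers), atoms_per_speaker)

# Toy observation: exemplar 7 (speaker 1) plus small non-negative noise.
y = D[:, 7] + 0.01 * rng.random(n_features)

x = sparse_activations(y, D)

# Simplified decision rule (not sparse discriminant analysis):
# pick the speaker whose exemplars carry the most activation weight.
scores = np.array([x[labels == s].sum() for s in range(n_speakers)])
predicted = int(np.argmax(scores))
print("activation mass per speaker:", scores, "-> predicted", predicted)
```

In this toy setting the activation vector `x` plays the role of the feature vector passed on to the modeling stage. A real system would build `D` from labelled speech and noise exemplars rather than random numbers.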

Bibliographic reference. Saeidi, Rahim / Hurmalainen, Antti / Virtanen, Tuomas / Leeuwen, David A. van (2012): "Exemplar-based sparse representation and sparse discrimination for noise robust speaker identification", In Odyssey-2012, 248-255.