13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Integrated Feature Normalization and Enhancement for robust Speaker Recognition using Acoustic Factor Analysis

Taufiq Hasan, John H. L. Hansen

Center for Robust Speech Systems (CRSS), Eric Jonsson School of Engineering, University of Texas at Dallas, Richardson, TX, USA

State-of-the-art factor-analysis-based channel compensation methods for speaker recognition assume that speaker/utterance-dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower-dimensional subspace. This assumption does not consider that conventional acoustic features may also be constrained in a similar way in the feature space. In this study, motivated by the low-rank covariance structure of cepstral features, we propose a factor analysis model in the acoustic feature space instead of the super-vector domain and derive a mixture-dependent feature transformation. We demonstrate that the proposed Acoustic Factor Analysis (AFA) transformation performs feature dimensionality reduction, de-correlation, variance normalization and enhancement at the same time. The transform applies a square-root Wiener gain along the acoustic feature eigenvector directions, and is similar to signal subspace based speech enhancement schemes. We also propose several methods of adaptively selecting the AFA parameter for each mixture. The proposed feature transform is applied using a probabilistic mixture alignment, and is integrated with a conventional i-Vector system. Experimental results on the telephone trials of the NIST SRE 2010 demonstrate the effectiveness of the proposed scheme.
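The abstract does not state the transform in closed form, so the following is only an illustrative sketch of how one transform can combine the four listed operations. It assumes a standard probabilistic PCA model fitted to a single Gaussian (one mixture component), with the noise variance taken as the mean of the discarded eigenvalues. The function name `afa_like_transform`, the parameter `q`, and the noise estimate are assumptions made for illustration and are not taken from the paper; the per-mixture probabilistic alignment and the adaptive parameter selection are not shown.

```python
import numpy as np

def afa_like_transform(X, q):
    """Illustrative PPCA-style transform for ONE Gaussian (assumption, not the paper's code).

    X : (n_frames, d) array of acoustic feature vectors.
    q : number of retained dimensions, with 0 < q < d.
    Returns the transformed features, the (d, q) transform, the mean,
    and the per-direction square-root Wiener gain.
    """
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)

    # Eigen-decomposition of the feature covariance, sorted in descending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Assumed noise variance: mean of the discarded eigenvalues (PPCA ML estimate).
    sigma2 = eigvals[q:].mean()
    lam, U = eigvals[:q], eigvecs[:, :q]

    # Enhancement: square-root Wiener gain sqrt((lam - sigma2) / lam) per direction.
    wiener_gain = np.sqrt(np.maximum(lam - sigma2, 0.0) / lam)
    # Variance normalization: scale each retained direction by 1 / sqrt(lam).
    whitening = 1.0 / np.sqrt(lam)

    # Projecting onto U de-correlates and reduces dimension from d to q.
    A = U * (wiener_gain * whitening)
    Y = (X - mu) @ A
    return Y, A, mu, wiener_gain
```

Under these assumptions the output covariance is diagonal, with each entry equal to the squared Wiener gain of that direction, so directions dominated by the assumed noise floor are attenuated while strong directions are kept near unit variance.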


Bibliographic reference.  Hasan, Taufiq / Hansen, John H. L. (2012): "Integrated feature normalization and enhancement for robust speaker recognition using acoustic factor analysis", In INTERSPEECH-2012, 1568-1571.