Odyssey 2012 - The Speaker and Language Recognition Workshop

June 25-28, 2012

Adaptation Transforms of Auto-Associative Neural Networks as Features for Speaker Verification

Samuel Thomas (1), Sri Harish Mallidi (1), Sriram Ganapathy (1), Hynek Hermansky (1,2)

(1) Center for Language and Speech Processing, Department of Electrical and Computer Engineering; (2) Human Language Technology Center of Excellence
The Johns Hopkins University, Baltimore, USA.

We present a new approach to using Auto-Associative Neural Networks (AANNs) in the conventional GMM speaker verification framework with i-vector feature extraction and PLDA modeling. In this technique, an i-vector feature extractor is trained using adaptation parameters from a mixture of AANNs. To model parts of each speaker's acoustic space, a training objective function based on posterior probabilities of broad phonetic classes is used. The AANN-based i-vectors are fused with GMM-based i-vectors and a joint PLDA model is trained. The proposed approach provides promising results and significant gains when combined with baseline systems on the telephone conditions of the NIST SRE 2010 and the recently concluded IARPA BEST 2011 speaker evaluations.

Bibliographic reference. Thomas, Samuel / Mallidi, Sri Harish / Ganapathy, Sriram / Hermansky, Hynek (2012): "Adaptation transforms of auto-associative neural networks as features for speaker verification", In Odyssey-2012, 98-104.