12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Boosting Speaker Recognition Performance with Compact Representations

Sibel Yaman, Jason Pelecanos, Mohamed Kamal Omar

IBM T.J. Watson Research Center, USA

This paper describes a speaker recognition system combination approach in which compact forms of MAP-adapted GMM supervectors are used to boost the performance of a high-dimensional supervector-based system or of a combination of multiple systems. The compact supervector representations are subjected to a diagonal transformation that emphasizes the dimensions carrying significant speaker information and de-emphasizes noisy dimensions. Scores obtained from these representations are then combined with the scores obtained from high-dimensional supervector representations. The transformation parameters and the combination weights are estimated by minimizing a discriminative training objective function that approximates a minimum detection cost function. We carried out experiments on two NIST 2008 Speaker Recognition Evaluation English telephony tasks to compare the proposed approach with direct combination of the scores obtained from low- and high-dimensional supervector representations. The proposed approach yields a relative gain of up to 18%.
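The abstract does not give the scoring function or the exact form of the objective, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that the compact-representation score is a diagonally weighted inner product of the enrollment and test vectors, that the combination is a weighted sum with a bias, and that the detection cost is smoothed by replacing the hard miss and false-alarm counts with sigmoids. The cost parameters (c_miss, c_fa, p_target), the plain gradient descent, and all function and variable names are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grad(d, w, enroll, test, high, is_target,
                  c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Smoothed detection cost and its gradient with respect to d and w.

    d: (D,) diagonal transformation weights (assumed form).
    w: (3,) = [weight on compact score, weight on high-dim score, bias].
    enroll, test: (N, D) compact vectors, one row per trial.
    high: (N,) scores from the high-dimensional system.
    is_target: (N,) boolean array, True for target trials.
    """
    prod = enroll * test                 # per-dimension products, (N, D)
    low = prod @ d                       # diagonally weighted inner product
    s = w[0] * low + w[1] * high + w[2]  # combined score per trial
    tgt, non = is_target, ~is_target
    miss = sigmoid(-s[tgt])              # soft miss indicator
    fa = sigmoid(s[non])                 # soft false-alarm indicator
    cost = c_miss * p_target * miss.mean() + c_fa * (1 - p_target) * fa.mean()
    # Derivative of the cost with respect to each trial score.
    g = np.zeros_like(s)
    g[tgt] = -c_miss * p_target * miss * (1 - miss) / tgt.sum()
    g[non] = c_fa * (1 - p_target) * fa * (1 - fa) / non.sum()
    grad_d = w[0] * (prod.T @ g)
    grad_w = np.array([g @ low, g @ high, g.sum()])
    return cost, grad_d, grad_w

def train(enroll, test, high, is_target, steps=300, lr=0.1):
    """Plain gradient descent; returns the lowest-cost (cost, d, w) seen."""
    d = np.ones(enroll.shape[1])         # start from the identity transform
    w = np.array([1.0, 1.0, 0.0])        # start from an unweighted sum
    best = (np.inf, d, w)
    for _ in range(steps):
        cost, grad_d, grad_w = cost_and_grad(d, w, enroll, test, high, is_target)
        if cost < best[0]:
            best = (cost, d.copy(), w.copy())
        d = d - lr * grad_d
        w = w - lr * grad_w
    return best
```

In this sketch, a dimension whose weight in d grows during training contributes more to the compact score, and a dimension whose weight shrinks toward zero is effectively suppressed, which mirrors the emphasis and de-emphasis described above. Direct score combination corresponds to keeping d fixed and training only w.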


Bibliographic reference. Yaman, Sibel / Pelecanos, Jason / Omar, Mohamed Kamal (2011): "Boosting speaker recognition performance with compact representations", In INTERSPEECH-2011, 381-384.