This paper describes a speaker recognition system combination approach in which compact forms of MAP-adapted GMM supervectors are used to boost the performance of a high-dimensional supervector-based system or of a combination of multiple systems. The compact supervector representations are subjected to a diagonal transformation that emphasizes the dimensions carrying significant speaker information and de-emphasizes noisy dimensions. Scores obtained from these representations are then combined with the scores obtained from high-dimensional supervector representations. The transformation parameters and the combination weights are estimated jointly by minimizing a discriminative training objective function that approximates the minimum detection cost function. We carried out experiments on two NIST 2008 Speaker Recognition Evaluation English telephony tasks to compare the proposed approach with direct combination of the scores obtained from low- and high-dimensional supervector representations. We found that the proposed approach yields a relative gain of up to 18%.
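The pipeline summarized in the abstract above (diagonal scaling of compact supervectors, linear score combination, and a smoothed detection cost as the training objective) can be sketched roughly as follows. This is an illustrative sketch under assumptions, not the authors' implementation: the inner-product scoring, the sigmoid smoothing with slope `alpha`, the bias term in the weights, and all function names are hypothetical choices made for the example, and the cost parameters are illustrative defaults.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def low_dim_score(d, enroll, test):
    """Score each trial as the inner product of diagonally scaled compact vectors.

    d      : (D,) diagonal transformation (assumed form of the transform)
    enroll : (N, D) compact enrollment supervectors, one row per trial
    test   : (N, D) compact test supervectors, one row per trial
    """
    return np.sum((d * enroll) * (d * test), axis=1)

def combined_score(d, w, enroll, test, s_high):
    """Linear combination of high- and low-dimensional system scores.

    w = (weight on high-dim score, weight on low-dim score, bias); assumed layout.
    """
    return w[0] * s_high + w[1] * low_dim_score(d, enroll, test) + w[2]

def smooth_dcf(scores, is_target, c_miss=10.0, c_fa=1.0, p_target=0.01, alpha=5.0):
    """Differentiable stand-in for the detection cost at a decision threshold of 0.

    Hard miss / false-alarm counts are replaced by sigmoids so that the cost
    can be minimized with respect to d and w by a gradient-based optimizer.
    """
    p_miss = np.mean(sigmoid(-alpha * scores[is_target]))  # targets scored low
    p_fa = np.mean(sigmoid(alpha * scores[~is_target]))    # non-targets scored high
    return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
```

In such a setup, `d` and `w` would be optimized jointly on development trials by passing `smooth_dcf(combined_score(...), is_target)` to a generic optimizer; the optimizer itself is left out of the sketch.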
Bibliographic reference. Yaman, Sibel / Pelecanos, Jason / Omar, Mohamed Kamal (2011): "Boosting speaker recognition performance with compact representations", In INTERSPEECH-2011, 381-384.