14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Improving Short Utterance Based I-Vector Speaker Recognition Using Source and Utterance-Duration Normalization Techniques

A. Kanagasundaram (1), D. Dean (1), Javier Gonzalez-Dominguez (2), S. Sridharan (1), D. Ramos (2), Joaquin Gonzalez-Rodriguez (2)

(1) Queensland University of Technology, Australia
(2) Universidad Autónoma de Madrid, Spain

A significant amount of speech is typically required for speaker verification system development and evaluation, especially in the presence of large intersession variability. This paper introduces source and utterance-duration normalized linear discriminant analysis (SUN-LDA) approaches to compensate for session variability in short-utterance i-vector speaker verification systems. Two variations of SUN-LDA are proposed in which normalization techniques are used to capture source variation from both short and full-length development i-vectors: one based on pooling (SUN-LDA-pooled) and the other on concatenation (SUN-LDA-concat) across the duration- and source-dependent session variation. Both techniques are shown to improve on traditional LDA in the NIST 08 truncated 10sec-10sec evaluation conditions. The largest gain comes from SUN-LDA-concat, which achieves a relative EER improvement of 8% for mismatched conditions and over 3% for matched conditions compared with traditional LDA approaches.
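The difference between pooling and concatenation can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact scatter definitions, so the code below assumes one plausible reading in which each source/duration condition contributes a between-speaker scatter matrix centred on that condition's own mean. The function names, the condition names (`tel_full`, `tel_short`, `mic_full`, `mic_short`), the choice of within-speaker scatter, and the synthetic data are all hypothetical.

```python
import numpy as np

def between_class_scatter(X, labels):
    """Between-speaker scatter of i-vectors X (n x d), centred on this condition's mean."""
    mu = X.mean(axis=0)
    S_b = np.zeros((X.shape[1], X.shape[1]))
    for spk in np.unique(labels):
        Xs = X[labels == spk]
        diff = (Xs.mean(axis=0) - mu)[:, None]
        S_b += len(Xs) * (diff @ diff.T)
    return S_b

def within_class_scatter(X, labels):
    """Within-speaker scatter: spread of each i-vector around its speaker mean."""
    S_w = np.zeros((X.shape[1], X.shape[1]))
    for spk in np.unique(labels):
        Xc = X[labels == spk] - X[labels == spk].mean(axis=0)
        S_w += Xc.T @ Xc
    return S_w

def lda_projection(S_b, S_w, k):
    """Top-k directions maximising between- over within-speaker scatter.

    Solves S_b v = lambda S_w v by whitening with the Cholesky factor of S_w,
    which must be positive definite.
    """
    L_inv = np.linalg.inv(np.linalg.cholesky(S_w))
    evals, evecs = np.linalg.eigh(L_inv @ S_b @ L_inv.T)
    top = np.argsort(evals)[::-1][:k]
    return L_inv.T @ evecs[:, top]

def shared_within_scatter(conditions):
    # Assumption: one within-speaker scatter estimated on all development data.
    X_all = np.vstack([X for X, _ in conditions.values()])
    y_all = np.concatenate([y for _, y in conditions.values()])
    return within_class_scatter(X_all, y_all)

def pooled_projection(conditions, k):
    """Pooling: sum the per-condition between-speaker scatters, then one LDA."""
    S_w = shared_within_scatter(conditions)
    S_b = sum(between_class_scatter(X, y) for X, y in conditions.values())
    return lda_projection(S_b, S_w, k)

def concat_projection(conditions, k):
    """Concatenation: one LDA per condition, projection matrices stacked column-wise."""
    S_w = shared_within_scatter(conditions)
    return np.hstack([
        lda_projection(between_class_scatter(X, y), S_w, k)
        for X, y in conditions.values()
    ])

# Synthetic stand-in for development i-vectors: 5 speakers, 20 vectors each,
# 10 dimensions, with a different offset per source/duration condition.
rng = np.random.default_rng(0)
dim, n_spk, per_spk = 10, 5, 20
spk_means = rng.normal(size=(n_spk, dim))
y = np.repeat(np.arange(n_spk), per_spk)
conditions = {}
for name in ["tel_full", "tel_short", "mic_full", "mic_short"]:
    offset = rng.normal(size=dim)
    conditions[name] = (spk_means[y] + offset + rng.normal(size=(len(y), dim)), y)

A_pooled = pooled_projection(conditions, k=2)   # d x k
A_concat = concat_projection(conditions, k=2)   # d x (k * number of conditions)
```

Under these assumptions, pooling keeps the output dimensionality at `k`, while concatenation grows it with the number of conditions, so the concatenated projection retains directions specific to each source and duration.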


Bibliographic reference.  Kanagasundaram, A. / Dean, D. / Gonzalez-Dominguez, Javier / Sridharan, S. / Ramos, D. / Gonzalez-Rodriguez, Joaquin (2013): "Improving short utterance based i-vector speaker recognition using source and utterance-duration normalization techniques", In INTERSPEECH-2013, 2465-2469.