14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

R-Norm: Improving Inter-Speaker Variability Modelling at the Score Level via Regression Score Normalisation

David Vandyke, Michael Wagner, Roland Goecke

University of Canberra, Australia

This paper presents a new method of score post-processing which utilises previously hidden relationships among client models and test probes that are found within the scores produced by an automatic speaker recognition system. We suggest the name r-Norm (for Regression Normalisation) for the method, which can be viewed both as a score normalisation process and as a novel and improved technique for modelling inter-speaker variability. The key component of the method lies in learning a regression model between development data scores and an 'ideal' score matrix, which can either be derived from clean data or created synthetically. To generate scores for experimental validation of the proposed idea we perform a classic GMM-UBM experiment employing mel-cepstral features on the 1sp-female task of the NIST 2003 SRE corpus. The r-Norm results are compared with the standard score post-processing/normalisation methods t-Norm and z-Norm. The r-Norm method is shown to perform very strongly, improving the equal error rate (EER) from 18.5% to 7.01% and significantly outperforming both z-Norm and t-Norm in this case. The baseline system performance was deemed acceptable for the aims of this experiment, which were focused on evaluating and comparing the performance of the proposed r-Norm idea.
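
As a rough illustration of the core idea only (learning a regression from development scores to an 'ideal' score matrix and applying it to new scores), the following Python sketch may help. It is not the authors' implementation: the abstract does not specify the regression model, so a ridge-regularised linear least-squares map is assumed here. The function names, the +1/-1 synthetic ideal matrix, and the toy score generator with per-model offsets are all hypothetical choices made for this example.

```python
import numpy as np

def add_bias(scores):
    """Append a column of ones so the linear map can learn an offset."""
    return np.hstack([scores, np.ones((scores.shape[0], 1))])

def fit_r_norm(dev_scores, ideal_scores, ridge=1e-3):
    """Assumed regression: ridge least squares from raw score rows to ideal rows.

    dev_scores, ideal_scores: arrays of shape (n_probes, n_models).
    The ridge term is applied to every weight, including the bias row.
    """
    X = add_bias(dev_scores)
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ ideal_scores)

def apply_r_norm(scores, weights):
    """Map raw score rows through the learned regression."""
    return add_bias(scores) @ weights

def ideal_matrix(labels, n_models):
    """Synthetic ideal scores (assumption): +1 for the true model, -1 otherwise."""
    ideal = -np.ones((len(labels), n_models))
    ideal[np.arange(len(labels)), labels] = 1.0
    return ideal

def toy_trial_scores(labels, offsets, rng):
    """Toy raw scores: unit noise, a per-model offset, and a bump on target trials."""
    n_probes, n_models = len(labels), len(offsets)
    scores = rng.normal(0.0, 1.0, size=(n_probes, n_models)) + offsets
    scores[np.arange(n_probes), labels] += 2.0
    return scores

rng = np.random.default_rng(0)
offsets = np.array([-3.0, -1.0, 0.0, 2.0, 4.0])  # model-dependent score shifts
dev_labels = rng.integers(0, 5, size=200)
test_labels = rng.integers(0, 5, size=100)
dev_scores = toy_trial_scores(dev_labels, offsets, rng)
test_scores = toy_trial_scores(test_labels, offsets, rng)

# Learn the map on development data, then normalise unseen test scores.
dev_ideal = ideal_matrix(dev_labels, 5)
weights = fit_r_norm(dev_scores, dev_ideal)
normalised = apply_r_norm(test_scores, weights)

is_target = ideal_matrix(test_labels, 5) > 0
print("target mean:", normalised[is_target].mean())
print("non-target mean:", normalised[~is_target].mean())
```

In this toy setting each output score is a function of the probe's whole row of scores against all client models, which is one way to read the abstract's "relationships among client models and test probes". By contrast, z-Norm and t-Norm standardise each score using impostor score statistics.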


Bibliographic reference.  Vandyke, David / Wagner, Michael / Goecke, Roland (2013): "R-norm: improving inter-speaker variability modelling at the score level via regression score normalisation", In INTERSPEECH-2013, 3117-3121.