Total variability modeling, which uses i-vector extraction to convert a variable-length sequence of feature vectors into a fixed-length i-vector, is currently a commonly adopted parametrization technique for state-of-the-art speaker verification systems. However, when the number of feature vectors is low, the uncertainty in the i-vector, which is only a point estimate under the linear-Gaussian model, becomes problematic. It is known that, given the hyperparameters, the zeroth- and first-order sufficient statistics completely characterize the extracted i-vector. In this study, we propose a minimax strategy for estimating the sufficient statistics in order to increase the robustness of the extracted i-vectors. We show experimentally that the proposed minimax technique improves over the baseline system from 9.89% to 7.99% on the NIST SRE 2010 8conv-10sec task.
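To make the role of the sufficient statistics concrete, the standard i-vector posterior under the linear-Gaussian model can be written as below. This is a sketch in conventional notation, none of which appears in the abstract itself: we assume a universal background model with C components, means m_c and covariances \Sigma_c, frame posteriors \gamma_t(c), and a total variability matrix with per-component blocks T_c.

```latex
% Assumed notation (not taken from the abstract):
%   x_t          feature vector of frame t, t = 1..L
%   \gamma_t(c)  posterior of UBM component c for frame t
%   m_c, \Sigma_c  UBM mean and covariance of component c
%   T_c          block of the total variability matrix for component c
\begin{align}
N_c &= \sum_{t=1}^{L} \gamma_t(c)
    && \text{(zeroth-order statistic)} \\
\tilde{F}_c &= \sum_{t=1}^{L} \gamma_t(c)\,(x_t - m_c)
    && \text{(centered first-order statistic)} \\
\operatorname{Cov}(w \mid X) &= \Bigl(I + \sum_{c=1}^{C} N_c\, T_c^{\top} \Sigma_c^{-1} T_c\Bigr)^{-1}
    && \text{(posterior covariance)} \\
\hat{w} &= \operatorname{Cov}(w \mid X) \sum_{c=1}^{C} T_c^{\top} \Sigma_c^{-1} \tilde{F}_c
    && \text{(i-vector: posterior mean)}
\end{align}
```

With T_c, m_c and \Sigma_c fixed, the i-vector depends on the utterance only through N_c and \tilde{F}_c. For short utterances the N_c are small, so the posterior covariance stays close to the prior covariance I and the point estimate is correspondingly uncertain. The block shows only the conventional extraction; the minimax estimation of the statistics is not specified in the abstract and is not reproduced here.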
Bibliographic reference. Hautamäki, Ville / Cheng, You-Chi / Rajan, Padmanabhan / Lee, Chin-Hui (2013): "Minimax i-vector extractor for short duration speaker verification", In INTERSPEECH-2013, 3708-3712.