4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
Minimum Classification Error (MCE) training has been shown to be effective in improving the performance of speaker identification systems. However, open problems remain, such as the variability of a speaker's voice characteristics through time. In this work, we analyze the degradation of a GMM-based text-independent speaker identification system when using test data recorded over 6 months after the training session. To counter this degradation, we study supervised adaptation based on Maximum a Posteriori (MAP) estimation and on MCE training. These techniques have been shown to provide good results for speaker adaptation in speech recognition. Our main result is that, starting from GMMs trained only on speech from session 1, similar identification results can be obtained for all the other sessions through incremental adaptation, using only 2.5 seconds of speech per speaker and session as data for the MCE adaptation procedure. We have also found that, in our extreme experimental setup, MAP becomes unhelpful when combined with MCE adaptation.
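The components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes diagonal-covariance GMMs stored as plain dictionaries, a relevance-factor form of MAP mean adaptation (a common formulation, not necessarily the one used in the paper), and the usual sigmoid-smoothed MCE loss. All function names and the parameters `relevance`, `eta` and `gamma` are hypothetical.

```python
import numpy as np

def log_joint(X, gmm):
    """log(w_m * N(x_t | mu_m, diag var_m)) for every frame t and component m."""
    diff = X[:, None, :] - gmm["means"][None, :, :]              # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / gmm["variances"], axis=2)
                       + np.sum(np.log(2 * np.pi * gmm["variances"]), axis=1))
    return np.log(gmm["weights"]) + log_comp                      # (T, M)

def avg_loglik(X, gmm):
    """Average per-frame log-likelihood, using a stable log-sum-exp."""
    lj = log_joint(X, gmm)
    m = lj.max(axis=1, keepdims=True)
    return float(np.mean(m[:, 0] + np.log(np.exp(lj - m).sum(axis=1))))

def identify(X, models):
    """Text-independent identification: pick the best-scoring speaker model."""
    scores = {name: avg_loglik(X, g) for name, g in models.items()}
    return max(scores, key=scores.get), scores

def map_adapt_means(X, gmm, relevance=16.0):
    """Assumed relevance-factor MAP update of the means only."""
    lj = log_joint(X, gmm)
    post = np.exp(lj - lj.max(axis=1, keepdims=True))
    post /= post.sum(axis=1, keepdims=True)                       # responsibilities (T, M)
    n = post.sum(axis=0)                                          # soft counts (M,)
    data_mean = (post.T @ X) / np.maximum(n, 1e-10)[:, None]      # (M, D)
    alpha = (n / (n + relevance))[:, None]                        # data vs. prior weight
    return dict(gmm, means=alpha * data_mean + (1 - alpha) * gmm["means"])

def mce_loss(scores, true_name, eta=1.0, gamma=1.0):
    """Sigmoid of the misclassification measure d = -g_true + softmax of rivals."""
    rivals = np.array([s for k, s in scores.items() if k != true_name])
    mx = rivals.max()
    g_rival = mx + np.log(np.mean(np.exp(eta * (rivals - mx)))) / eta
    d = -scores[true_name] + g_rival
    return float(1.0 / (1.0 + np.exp(-gamma * d)))
```

In an MCE adaptation scheme of this kind, the model parameters would be updated by gradient descent on `mce_loss` summed over the adaptation utterances. That step is omitted here because the abstract does not specify the optimizer or which parameters are updated.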
Bibliographic reference. Martín del Alamo, Cesar / Alvarez, J. / Torre, C. de la / Poyatos, F. J. / Hernández, Lúis (1996): "Incremental speaker adaptation with minimum error discriminative training for speaker identification", In ICSLP-1996, 1760-1763.