EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology
2nd INTERSPEECH Event

Aalborg, Denmark
September 3-7, 2001

Minimum Classification Error Training for Speaker Identification Using Gaussian Mixture Models Based on Multi-Space Probability Distribution

Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura

Nagoya Institute of Technology, Japan

In our previous work, we proposed a speaker modeling technique for text-independent speaker identification that uses spectral and pitch features and is based on Multi-Space Probability Distribution Gaussian Mixture Models (MSD-GMMs). We presented a maximum likelihood (ML) estimation procedure for the MSD-GMM parameters and demonstrated its high recognition performance. In this paper, we describe a minimum classification error (MCE) training procedure for the MSD-GMM speaker models. MCE training is also applied to automatically estimate mixture-dependent stream weights for the spectral and pitch streams. The MCE-trained MSD-GMM speaker models are evaluated on a text-independent speaker identification task. Experimental results show that MCE training of the MSD-GMM parameters significantly reduces identification errors, and that system performance improves further when the spectral and pitch streams are appropriately weighted using MCE training.
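To illustrate the kind of objective that MCE training minimizes, the following is a minimal sketch of the generic smoothed MCE loss for a single utterance. It assumes the commonly used formulation with a misclassification measure passed through a sigmoid; the abstract does not state the exact form used in the paper, so the function name `mce_loss`, the parameters `eta` and `gamma`, and the use of per-speaker average log-likelihoods as discriminant scores are all illustrative assumptions.

```python
import math

def mce_loss(scores, true_idx, eta=1.0, gamma=1.0):
    """Smoothed MCE loss for one utterance (generic form, an assumption).

    scores   : one discriminant value per speaker model, e.g. the
               average log-likelihood of the utterance (hypothetical input)
    true_idx : index of the correct speaker
    eta      : controls how strongly the best rival dominates
    gamma    : slope of the sigmoid that smooths the 0/1 error
    """
    rivals = [g for j, g in enumerate(scores) if j != true_idx]
    # Log-mean-exp over the rival scores, shifted by the maximum
    # for numerical stability.
    m = max(eta * g for g in rivals)
    log_mean_exp = m + math.log(
        sum(math.exp(eta * g - m) for g in rivals) / len(rivals)
    )
    # Misclassification measure: negative when the true speaker
    # scores higher than the soft maximum of the rivals.
    d = -scores[true_idx] + log_mean_exp / eta
    # The sigmoid maps d to a differentiable approximation of the error count.
    return 1.0 / (1.0 + math.exp(-gamma * d))
```

In a scheme of this kind, the loss is below 0.5 when the correct speaker model scores best and above 0.5 otherwise, so gradient-based updates of the model parameters (and, in the setting described above, of the stream weights) push the loss toward zero on the training utterances.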


Bibliographic reference.  Miyajima, Chiyomi / Tokuda, Keiichi / Kitamura, Tadashi (2001): "Minimum classification error training for speaker identification using Gaussian mixture models based on multi-space probability distribution", In EUROSPEECH-2001, 2837-2840.