4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Frame Level Likelihood Normalization for Text-independent Speaker Identification using Gaussian Mixture Models

Konstantin P. Markov, Seiichi Nakagawa

Dept. of Information and Computer Sciences, Toyohashi Univ. of Tech., Aichi-ken, Japan

In this paper we propose a new speaker identification system that introduces the likelihood normalization technique widely used for speaker verification. In the new system, which is based on Gaussian Mixture Models, every frame of the test utterance is input to all the reference models in parallel. Because the likelihoods from all the models are then available for each frame, they can be normalized at every frame. A special kind of likelihood normalization, called Weighting Models Rank, is also proposed. Experiments were performed using two databases, TIMIT and NTT. Evaluation results clearly show that the frame-level likelihood normalization technique is superior to the standard accumulated likelihood approach.
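
The scoring scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the function names (`gmm_frame_loglik`, `identify`), the diagonal-covariance GMMs, the normalizer (the mean likelihood of the other models for the same frame, a form common in speaker verification), and the rank weights `exp(-rank)` used for the rank-based variant. It is not the authors' implementation, and the toy data are synthetic.

```python
import numpy as np

def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    s = m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))
    return np.squeeze(s, axis=axis)

def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under one diagonal-covariance GMM.
    frames: (T, D); weights: (M,); means, variances: (M, D). Returns (T,)."""
    diff = frames[:, None, :] - means[None, :, :]              # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    return _logsumexp(log_comp + np.log(weights), axis=1)

def identify(frames, models, mode="normalized"):
    """Score every frame against all speaker models in parallel.
    Returns (index of the best-scoring model, per-model scores)."""
    ll = np.stack([gmm_frame_loglik(frames, *m) for m in models], axis=1)  # (T, S)
    n_models = ll.shape[1]
    if mode == "accumulated":
        # Baseline: sum the raw frame log-likelihoods per model.
        scores = ll.sum(axis=0)
    elif mode == "normalized":
        # Assumed normalizer: mean likelihood of the *other* models per frame.
        norm = np.stack(
            [_logsumexp(np.delete(ll, i, axis=1), axis=1) for i in range(n_models)],
            axis=1,
        ) - np.log(n_models - 1)
        scores = (ll - norm).sum(axis=0)
    elif mode == "rank":
        # Rank models per frame (0 = most likely) and accumulate
        # hypothetical rank weights exp(-rank).
        ranks = ll.argsort(axis=1)[:, ::-1].argsort(axis=1)
        scores = np.exp(-ranks.astype(float)).sum(axis=0)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return int(np.argmax(scores)), scores

# Toy usage: three two-component speaker models in 2-D (synthetic values).
def _model(means):
    means = np.asarray(means, dtype=float)
    return (np.array([0.5, 0.5]), means, np.ones_like(means))

models = [_model([[0, 0], [1, 1]]),
          _model([[5, 5], [6, 6]]),
          _model([[-5, -5], [-4, -6]])]
rng = np.random.default_rng(0)
frames = rng.normal(loc=5.5, scale=0.5, size=(20, 2))  # drawn near model 1
results = {mode: identify(frames, models, mode)
           for mode in ("accumulated", "normalized", "rank")}
```

In this sketch the normalizer excludes the model being scored; subtracting a term shared by all models at each frame would leave the accumulated ranking unchanged, so some model-dependent normalizer is needed for the frame-level scheme to differ from the baseline.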


Bibliographic reference.  Markov, Konstantin P. / Nakagawa, Seiichi (1996): "Frame level likelihood normalization for text-independent speaker identification using Gaussian mixture models", In ICSLP-1996, 1764-1767.