4th International Conference on Spoken Language Processing, Philadelphia, PA, USA
In this paper we propose a new speaker identification system that introduces the likelihood normalization technique widely used in speaker verification. In the new system, which is based on Gaussian Mixture Models, every frame of the test utterance is fed to all the reference models in parallel. Because the likelihoods from all the models are then available for each frame, they can be normalized frame by frame. A special kind of likelihood normalization, called Weighting Models Rank, is also proposed. Experiments were performed using two databases, TIMIT and NTT. Evaluation results clearly show that the frame-level likelihood normalization technique is superior to the standard accumulated likelihood approach.
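The abstract does not give the scoring formulas, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes diagonal-covariance GMMs, a cohort-style normalization in which each model's frame log-likelihood is reduced by the log of the mean likelihood of all the other models, and a purely hypothetical linear rank weight (best model gets N points, worst gets 1) standing in for Weighting Models Rank. All function names and the `mode` parameter are illustrative.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def gmm_log_likelihood(frame, gmm):
    """Log-likelihood of one frame under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) components.
    """
    terms = []
    for weight, means, variances in gmm:
        log_density = sum(
            -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
            for x, mu, var in zip(frame, means, variances)
        )
        terms.append(math.log(weight) + log_density)
    return logsumexp(terms)


def frame_scores(frame, models, mode):
    """Score one frame against every reference model in parallel."""
    names = list(models)
    ll = {n: gmm_log_likelihood(frame, models[n]) for n in names}
    if mode == "accumulated":
        # Baseline: raw frame log-likelihoods, summed later over the utterance.
        return ll
    if mode == "normalized":
        # Assumed normalization: subtract the log of the mean likelihood
        # of all competing models, computed at this frame.
        scores = {}
        for n in names:
            others = [ll[m] for m in names if m != n]
            scores[n] = ll[n] - (logsumexp(others) - math.log(len(others)))
        return scores
    if mode == "rank":
        # Hypothetical rank weighting: N points for the best model, 1 for the worst.
        ordered = sorted(names, key=lambda n: ll[n], reverse=True)
        return {n: float(len(names) - i) for i, n in enumerate(ordered)}
    raise ValueError("unknown mode: " + mode)


def identify(frames, models, mode="normalized"):
    """Return (best_speaker, per-speaker totals) for a test utterance."""
    if len(models) < 2:
        raise ValueError("normalization needs at least two reference models")
    totals = {n: 0.0 for n in models}
    for frame in frames:
        for n, s in frame_scores(frame, models, mode).items():
            totals[n] += s
    return max(totals, key=totals.get), totals


# Toy usage: three single-component, one-dimensional "speaker" models.
models = {
    "A": [(1.0, [0.0], [1.0])],
    "B": [(1.0, [5.0], [1.0])],
    "C": [(1.0, [10.0], [1.0])],
}
frames = [[0.2], [-0.1], [0.4]]
best, totals = identify(frames, models, mode="rank")
```

In this sketch the competing models are excluded from each model's own normalizer, so the normalized score differs from model to model and can change the final ranking relative to the plain accumulated log-likelihood.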
Bibliographic reference. Markov, Konstantin P. / Nakagawa, Seiichi (1996): "Frame level likelihood normalization for text-independent speaker identification using Gaussian mixture models", In ICSLP-1996, 1764-1767.