Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Speaker Cluster Based GMM Tokenization for Speaker Recognition

Bin Ma, Donglai Zhu, Rong Tong, Haizhou Li

Institute for Infocomm Research, Singapore

We present a speaker recognition system with multiple GMM tokenizers as the front-end and vector space modeling as the back-end classifier. A GMM tokenizer captures the acoustic and phonetic characteristics of a speaker from the speech without requiring phonetic transcription. To broaden the coverage of speaker characteristics and provide more discriminative information, we propose a speaker clustering algorithm to build multiple GMM tokenizers that are arranged in parallel. For an input utterance, each tokenizer outputs a token sequence, which is then represented by a vector of n-gram probabilities. The vectors from all tokenizers are concatenated to form a composite vector. Finally, a Support Vector Machine (SVM) is used as the back-end classifier of the composite vectors. We use the 2002 NIST Speaker Recognition Evaluation (SRE) corpus for training the GMM tokenizers and for background modeling, and evaluate on the 2001 NIST SRE corpus.
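The front-end described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each frame is tokenized as the index of its highest-scoring mixture component, that the GMMs have diagonal covariances, and that bigrams (n = 2) are used. The abstract specifies none of these details. All function names, the toy tokenizers, and the toy frames are hypothetical. The speaker clustering step and the SVM back-end are not shown.

```python
import math
from collections import Counter
from itertools import product


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def tokenize(frames, gmm):
    """Map each frame to the index of its top-scoring mixture component.

    Assumption: top-1 component index is used as the token.
    gmm is a list of (weight, mean, variance) tuples.
    """
    tokens = []
    for x in frames:
        scores = [math.log(w) + log_gauss(x, mean, var) for w, mean, var in gmm]
        tokens.append(max(range(len(gmm)), key=scores.__getitem__))
    return tokens


def ngram_vector(tokens, vocab_size, n=2):
    """Relative frequency of every possible n-gram, in a fixed order."""
    counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    total = sum(counts.values())
    return [counts[g] / total if total else 0.0
            for g in product(range(vocab_size), repeat=n)]


def composite_vector(frames, tokenizers, n=2):
    """Concatenate the n-gram vectors produced by the parallel tokenizers."""
    vec = []
    for gmm in tokenizers:
        vec.extend(ngram_vector(tokenize(frames, gmm), len(gmm), n))
    return vec


# Two toy tokenizers, each a 2-component GMM over 1-D features (hypothetical values).
tokenizer_a = [(0.5, [0.0], [1.0]), (0.5, [4.0], [1.0])]
tokenizer_b = [(0.5, [0.0], [1.0]), (0.5, [6.0], [1.0])]
frames = [[0.1], [0.2], [2.5], [4.2], [0.0]]

vec = composite_vector(frames, [tokenizer_a, tokenizer_b])
```

In a full system, one such composite vector per utterance would be passed to the SVM back-end. With realistic mixture sizes the n-gram vector grows as the component count raised to the power n, so the choice of n and of the number of components per tokenizer directly sets the dimensionality of the back-end.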


Bibliographic reference.  Ma, Bin / Zhu, Donglai / Tong, Rong / Li, Haizhou (2006): "Speaker cluster based GMM tokenization for speaker recognition", In INTERSPEECH-2006, paper 1429-Mon3A1O.4.