Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Speaker Recognition Models

Kin Yu, John S. Mason, John Oglesby

Speech Research Group, Department of Electrical and Electronic Engineering, University of Wales, Swansea, UK

This paper evaluates continuous density hidden Markov models (CDHMMs), dynamic time warping (DTW) and distortion-based vector quantisation (VQ) for speaker recognition as the amount of training data is increased. In a comparison of VQ and CDHMMs for text-independent (TI) speaker recognition, VQ performs better than an equivalent CDHMM when trained with one training version, but is outperformed by the CDHMM when trained with ten training versions. In text-dependent (TD) experiments, a comparison of DTW, VQ and CDHMMs shows that DTW outperforms both VQ and CDHMMs when training data is sparse, but that with more data the performance of the three models is indistinguishable. Further analysis shows the TD architecture to be superior to the TI architecture for speaker recognition, and TD results on individual digits show zero, one and nine to be good discriminators.
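As a rough illustration of two of the model types compared above, the following Python sketch scores a test utterance against a VQ codebook (average nearest-codeword distortion) and against a DTW template (cumulative cost of the best monotonic alignment). It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the features, codebook size, distance measure or DTW path constraints, so the Euclidean distance, the unconstrained three-way DTW step and the names `vq_distortion`, `dtw_distance` and `identify` are all hypothetical choices made for illustration. The CDHMM is omitted for brevity.

```python
import math

def euclidean(a, b):
    # Assumed frame-level distance; the abstract does not name one.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def vq_distortion(frames, codebook):
    """Average distance from each test frame to its nearest codeword.

    Frame order is ignored, so the same score applies to TI or TD use.
    """
    total = sum(min(euclidean(f, c) for c in codebook) for f in frames)
    return total / len(frames)

def dtw_distance(test, template):
    """Cumulative cost of the best monotonic alignment of two sequences.

    Uses the basic three-way step (insert, delete, match) with no
    slope or window constraints; this is an illustrative assumption.
    """
    n, m = len(test), len(template)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(test[i - 1], template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # skip a test frame
                                 cost[i][j - 1],      # skip a template frame
                                 cost[i - 1][j - 1])  # match both
    return cost[n][m]

def identify(frames, models, score):
    """Return the speaker whose model gives the lowest score."""
    return min(models, key=lambda name: score(frames, models[name]))
```

Here `models` maps a speaker name to either a codebook (for `vq_distortion`) or a template sequence (for `dtw_distance`); with a single training version a DTW template can be the training utterance itself, whereas a VQ codebook has to be estimated from the available frames.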


Bibliographic reference. Yu, Kin / Mason, John S. / Oglesby, John (1995): "Speaker recognition models", in EUROSPEECH-1995, 629-632.