5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Improved Speaker Verification System With Limited Training Data On Telephone Quality Speech

Salleh Hussain (1,2), Fergus R. McInnes (1), Mervyn A. Jack (1)

(1) Centre for Communication Interface Research, University of Edinburgh, UK
(2) Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Malaysia

A hybrid neural network is proposed for speaker verification (SV). The basic idea of the system is to use vector quantization preprocessing as the feature extractor. The experiments were carried out using a neural network model (NNM) with frame labelling performed from a client codebook, referred to as NNM-C. The improved performance of NNM-C with more inputs and proper alignment of the speech signals supports the hypothesis that a more detailed representation of the speech patterns is helpful for the system. The system achieves an equal error rate (EER) of 11.2% on a single isolated digit and 0.7% on a sequence of 12 isolated digits. The paper also compares the neural network speaker verification system with more conventional methods such as hidden Markov models.
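For readers unfamiliar with the metric, the equal error rate is the operating point at which the false acceptance rate (impostors accepted) equals the false rejection rate (clients rejected). The following is a minimal sketch of how an EER could be estimated from verification scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that higher scores favour the claimed speaker, and the example scores are all assumptions made for illustration.

```python
def equal_error_rate(client_scores, impostor_scores):
    """Estimate the EER and the threshold at which it occurs.

    Assumption: a higher score means the trial is more likely to come
    from the claimed (client) speaker.
    """
    if not client_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap between rates, averaged error rate, threshold)
    # Try every observed score as a candidate decision threshold.
    for threshold in sorted(set(client_scores) | set(impostor_scores)):
        # False rejection rate: client trials scoring below the threshold.
        frr = sum(s < threshold for s in client_scores) / len(client_scores)
        # False acceptance rate: impostor trials at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, threshold)
    return best[1], best[2]


# Made-up scores for illustration only (not data from the paper).
clients = [0.9, 0.8, 0.75, 0.6, 0.4]
impostors = [0.7, 0.5, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(clients, impostors)
print(f"EER = {eer:.1%} at threshold {threshold}")
```

With a finite set of trials the two error rates rarely cross exactly, so this sketch reports the average of the two rates at the threshold where they are closest; other interpolation schemes are equally plausible.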

Bibliographic reference.  Hussain, Salleh / McInnes, Fergus R. / Jack, Mervyn A. (1997): "Improved speaker verification system with limited training data on telephone quality speech", In EUROSPEECH-1997, 835-838.