Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Multi-State Predictive Neural Networks for Text-Independent Speaker Recognition

T. Artieres, Patrick Gallinari

LAFORIA UA CNRS 1095, University Paris, Paris, France

Both Hidden Markov Models (HMMs) and Neural Networks have already been used as production systems for speaker identification or verification. Recently, [9] has shown that ergodic multi-state HMMs do not outperform one-state "hidden" Markov Models, i.e. Gaussian Mixture Models, for speaker recognition. That work showed that the important characteristic of these models is the total number of mixture components, not the number of states. These HMMs are thus unable to make use of temporal information for speaker recognition. On the other hand, recent experiments have shown that, for neural predictive systems, modeling non-stationarity significantly improves performance [6]. We are interested here in the development of such models, which will be referred to as multi-state predictive neural networks (MSPNNs). We study the ability of these systems to perform speaker identification and discuss the superiority of multi-state over one-state models. We provide results on 15 talkers from the TIMIT database.

Bibliographic reference.  Artieres, T. / Gallinari, Patrick (1995): "Multi-state predictive neural networks for text-independent speaker recognition", In EUROSPEECH-1995, 633-636.