Both hidden Markov models (HMMs) and neural networks have already been used in production systems for speaker identification or verification. Recent work has shown that ergodic multi-state HMMs do not outperform one-state "hidden" Markov models, i.e. Gaussian mixture models, for speaker recognition, and that the important characteristic of these models is the total number of mixture components rather than the number of states. These HMMs are thus unable to exploit temporal information for speaker recognition. On the other hand, recent experiments have shown that, for neural predictive systems, modeling non-stationarity significantly improves performance. We are interested here in developing such models, which we refer to as multi-state predictive neural networks (MSPNNs). We study the ability of these systems to perform speaker identification and discuss the superiority of multi-state over one-state models. We provide results on 15 talkers from the TIMIT database.
Bibliographic reference. Artieres, T. / Gallinari, Patrick (1995): "Multi-state predictive neural networks for text-independent speaker recognition", In EUROSPEECH-1995, 633-636.