Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Speaker Recognition Experiments in Estonian Using Multi-Layer Feed-Forward Neural Nets

Toomas Altosaar (1), Einar Meister (2)

(1) Acoustics Laboratory, Helsinki University of Technology, Espoo, Finland
(2) Laboratory of Phonetics and Speech Technology, Institute of Cybernetics, Estonian Academy of Sciences, Tallinn, Estonia

This paper presents a general strategy for robust and efficient speaker recognition. Emphasis is placed on comparing the usefulness of different features calculated from the speech signal at different temporal and spectral resolutions. Specifically, three spectral features are evaluated in a neural network environment: linear-frequency, loudness-scaled spectra; auditory spectra from an auditory model; and the lattice coefficients from a warped linear predictor. These features are tested with four different neural network topologies, ranging from speaker identification to speaker verification configurations. The neural net dimensions are also varied to gain an understanding of the complexity of the problem. The tests are based on 40 minutes of speech recorded from a set of 20 native Estonian speakers.


Bibliographic reference. Altosaar, Toomas / Meister, Einar (1995): "Speaker recognition experiments in Estonian using multi-layer feed-forward neural nets", in EUROSPEECH-1995, 333-336.