Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Speaker Recognition Using HMM Composition in Noisy Environments

Tomoko Matsui (1), Tomohito Kanno (2), Sadaoki Furui (1,2)

(1) NTT Human Interface Laboratories, Musashino-shi, Tokyo, Japan
(2) Tokyo Institute of Technology, Meguro-ku, Tokyo, Japan

This paper investigates a speaker recognition method that is robust against background noise. In noisy environments, an important issue is how to build each speaker's model so that it compensates for the noise. The method described here is based on hidden Markov model (HMM) composition using the noise-and-voice (NOVO) transform. HMM composition combines a speaker HMM and a noise-source HMM into a noise-added speaker HMM for a particular signal-to-noise ratio (SNR). Because the SNR is difficult to measure exactly for non-stationary noise, the method creates several noise-added speaker HMMs with various SNRs. The HMM with the highest likelihood for the input speech is selected, and the speaker decision is made using that likelihood value. Applied experimentally to text-independent speaker identification and verification in various kinds of noisy environments, the method yielded considerable improvement in speaker recognition performance on utterances from male speakers.
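The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the log-likelihood of the input speech under every noise-added speaker HMM has already been computed elsewhere and stored in a nested dictionary. The names `best_snr_score`, `identify`, and `verify`, the dictionary layout, and the fixed-threshold verification rule are all hypothetical choices made for illustration; the abstract says only that the decision uses the selected likelihood value.

```python
from typing import Dict, Tuple

# Hypothetical layout: scores[speaker][snr_db] is the log-likelihood of the
# input speech under that speaker's noise-added HMM composed at snr_db.
Scores = Dict[str, Dict[float, float]]


def best_snr_score(per_snr: Dict[float, float]) -> Tuple[float, float]:
    """Pick the noise-added model with the highest log-likelihood.

    Returns (snr_db, log_likelihood) for that model.
    """
    snr_db = max(per_snr, key=lambda s: per_snr[s])
    return snr_db, per_snr[snr_db]


def identify(scores: Scores) -> Tuple[str, float, float]:
    """Speaker identification: choose the speaker whose best
    noise-added model scores highest on the input speech."""
    best = {spk: best_snr_score(per_snr) for spk, per_snr in scores.items()}
    speaker = max(best, key=lambda spk: best[spk][1])
    snr_db, log_lik = best[speaker]
    return speaker, snr_db, log_lik


def verify(scores: Scores, claimed: str, threshold: float) -> bool:
    """Speaker verification: accept the claimed identity if its best
    noise-added model reaches the threshold.

    Thresholding the raw log-likelihood is an assumption of this sketch.
    """
    _, log_lik = best_snr_score(scores[claimed])
    return log_lik >= threshold


# Made-up example scores for two speakers at three SNRs (dB).
example_scores: Scores = {
    "spk_a": {0.0: -120.0, 10.0: -95.5, 20.0: -101.0},
    "spk_b": {0.0: -130.0, 10.0: -110.0, 20.0: -98.0},
}
```

With these made-up scores, `identify(example_scores)` picks `spk_a` through its 10 dB model, and `verify(example_scores, "spk_b", -100.0)` accepts the claim because the 20 dB model for `spk_b` scores above the threshold.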


Bibliographic reference. Matsui, Tomoko / Kanno, Tomohito / Furui, Sadaoki (1995): "Speaker recognition using HMM composition in noisy environments", in EUROSPEECH-1995, 621-624.