13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

A Non-Uniform Filterbank for Speaker Recognition

Jia Min Karen Kua (1,2), Tharmarajah Thiruvaran (1), Eliathamby Ambikairajah (1,2)

(1) School of Electrical Engineering and Telecommunications, The University of New South Wales, Sydney, NSW, Australia
(2) ATP Research Laboratory, National ICT Australia (NICTA), Eveleigh, NSW, Australia

Speaker-specific information is known to be distributed non-uniformly across the frequency domain. Current speaker recognition systems use auditory-motivated scales to extract acoustic features. These scales, however, are not optimised to exploit the spectral distribution of speaker-specific information and hence may not be the best choice for speaker recognition. In this paper, we study the distribution of speaker-specific information in the Spectral Centroid Frequency (SCF) feature and propose a non-uniform filterbank that captures this information effectively. We use the F-ratio and the Kullback-Leibler (KL) distance to measure the distribution of speaker-specific information, and we show empirically that the KL distance is a better measure of discriminative ability than the F-ratio. The proposed filterbank emphasises regions of high KL distance by allocating more filters to those regions. Experimental results show a relative equal error rate (EER) reduction of 8.8% over the Mel-scale filterbank on the NIST 2006 SRE database.
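The idea of allocating more filters to discriminative regions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact F-ratio or KL formulations, so the code assumes a standard per-band F-ratio, a symmetrised KL distance between univariate Gaussians, and an equal-area rule for placing band edges. The function names (f_ratio, symmetric_gaussian_kl, allocate_band_edges) are hypothetical.

```python
import numpy as np

def f_ratio(features, labels):
    """Per-band F-ratio: variance of the speaker means divided by the
    average within-speaker variance (assumed textbook definition)."""
    speakers = np.unique(labels)
    means = np.array([features[labels == s].mean(axis=0) for s in speakers])
    within = np.array([features[labels == s].var(axis=0) for s in speakers])
    return means.var(axis=0) / within.mean(axis=0)

def symmetric_gaussian_kl(mu_p, var_p, mu_q, var_q):
    """Symmetrised KL distance, KL(p||q) + KL(q||p), between univariate
    Gaussians. Assumption: the paper may use a different estimator."""
    kl_pq = 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)
    kl_qp = 0.5 * (np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)
    return kl_pq + kl_qp

def allocate_band_edges(score, freqs, n_filters):
    """Return n_filters + 1 band edges such that every band covers an equal
    share of the area under the score curve. High-score regions therefore
    receive narrower, more numerous filters. Assumes score > 0 everywhere."""
    score = np.asarray(score, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # Trapezoidal cumulative integral of the score over frequency.
    steps = 0.5 * (score[1:] + score[:-1]) * np.diff(freqs)
    cumulative = np.concatenate(([0.0], np.cumsum(steps)))
    # Invert the cumulative curve at equally spaced targets.
    targets = np.linspace(0.0, cumulative[-1], n_filters + 1)
    return np.interp(targets, cumulative, freqs)

# Illustrative (made-up) score curve: three times higher below 1 kHz.
freqs = np.linspace(0.0, 4000.0, 81)
score = np.where(freqs < 1000.0, 3.0, 1.0)
edges = allocate_band_edges(score, freqs, n_filters=4)
widths = np.diff(edges)
```

With this made-up score curve, the lowest band comes out narrower than the highest band, mirroring the abstract's principle of concentrating filters where the discriminability measure is high. With a flat score curve the same routine reduces to a uniform filterbank.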

Index Terms: speaker recognition, F-ratio, Kullback-Leibler distance, spectral centroid frequency


Bibliographic reference.  Kua, Jia Min Karen / Thiruvaran, Tharmarajah / Ambikairajah, Eliathamby (2012): "A non-uniform filterbank for speaker recognition", In INTERSPEECH-2012, 2274-2277.