EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology
2nd INTERSPEECH Event

Aalborg, Denmark
September 3-7, 2001

Background Learning of Speaker Voices for Text-Independent Speaker Identification

Wei-Ho Tsai (1), Y. C. Chu (2), Chao-Shih Huang (2), Wen-Whei Chang (3)

(1) Philips Research East Asia-Taipei / National Chiao Tung University, Taiwan, ROC
(2) Philips Research East Asia-Taipei, Taiwan, ROC
(3) National Chiao Tung University, Taiwan, ROC

This study introduces a novel learning mechanism, termed background learning, for text-independent speaker identification (speaker ID). Unlike conventional speaker ID, the proposed system does not rely on client enrollment data to construct speaker-specific models. Instead, it attempts to learn speakers' voices by clustering and parametrically modeling data that were collected off-line and carry no speaker-identity labels. This eliminates the need to enroll a large amount of speech data from each client. To permit such unsupervised learning, an efficient algorithm is developed for blind clustering of speech utterances based on speaker characteristics. Experimental results show that when only very limited enrollment data are available, the speaker-ID performance achieved with background learning is comparable to that obtained with abundant enrollment data.
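
The pipeline described in the abstract (cluster unlabeled utterances, fit a parametric model per cluster, then use a small amount of enrollment data to link clients to the learned models) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the clustering method or the model family, so plain k-means and a single diagonal-covariance Gaussian per cluster are used here as stand-ins. The fixed-length utterance vectors, the toy data, and all function names (`background_learn`, `assign_client`, `identify`) are hypothetical.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vec(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans(vectors, k, iters=20):
    # Stand-in for the paper's blind clustering (assumption: plain k-means).
    # Farthest-point initialization keeps the sketch deterministic.
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(max(vectors,
                           key=lambda v: min(sq_dist(v, c) for c in centers)))
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(v, centers[j]))
                  for v in vectors]
        for j in range(k):
            members = [v for v, l in zip(vectors, labels) if l == j]
            if members:
                centers[j] = mean_vec(members)
    return labels

def fit_gaussian(vectors, var_floor=1e-3):
    # Assumption: one diagonal-covariance Gaussian per cluster.
    mu = mean_vec(vectors)
    var = [max(sum((x - m) ** 2 for x in col) / len(vectors), var_floor)
           for col, m in zip(zip(*vectors), mu)]
    return mu, var

def log_likelihood(v, model):
    mu, var = model
    return sum(-0.5 * (math.log(2 * math.pi * s) + (x - m) ** 2 / s)
               for x, m, s in zip(v, mu, var))

def background_learn(unlabeled, k):
    # Cluster unlabeled utterance vectors, then model each non-empty cluster.
    labels = kmeans(unlabeled, k)
    clusters = [[v for v, l in zip(unlabeled, labels) if l == j]
                for j in range(k)]
    return [fit_gaussian(c) for c in clusters if c]

def assign_client(enroll_vecs, models):
    # Link a client to the background model that best explains
    # the client's (small) enrollment set.
    return max(range(len(models)),
               key=lambda j: sum(log_likelihood(v, models[j])
                                 for v in enroll_vecs))

def identify(test_vec, client_to_model, models):
    # Pick the client whose linked model gives the highest log-likelihood.
    return max(client_to_model,
               key=lambda c: log_likelihood(test_vec, models[client_to_model[c]]))

# Toy data: two well-separated "voices" in a 2-D feature space (hypothetical).
unlabeled = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2],
             [5.0, 5.1], [5.2, 4.9], [4.9, 5.0], [5.1, 5.2]]
models = background_learn(unlabeled, k=2)
client_to_model = {
    "alice": assign_client([[0.05, 0.0]], models),  # one enrollment vector each
    "bob": assign_client([[5.0, 5.0]], models),
}
print(identify([0.1, 0.1], client_to_model, models))
print(identify([4.8, 5.3], client_to_model, models))
```

In this sketch a single enrollment vector per client is enough to select a model, because the models themselves were estimated from the unlabeled background data. A real system would operate on sequences of frame-level acoustic features and would likely use richer models than one Gaussian per cluster.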

Bibliographic reference. Tsai, Wei-Ho / Chu, Y. C. / Huang, Chao-Shih / Chang, Wen-Whei (2001): "Background learning of speaker voices for text-independent speaker identification", In EUROSPEECH-2001, 767-771.