First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Text-Independent Speaker Recognition Using Vocal Tract and Pitch Information

Tomoko Matsui, Sadaoki Furui

NTT Human Interface Laboratories, Musashino-shi, Tokyo, Japan

This paper proposes a new text-independent speaker recognition method based on vector quantization (VQ) that uses both vocal tract and pitch information. The aim is a speaker recognition system that is robust against temporal variations of the feature parameters. Several feature parameters related to vocal tract and pitch information are introduced, extracted from spoken vowels, words, and sentences. A new normalization method, Talker Variability Normalization (TVN), enhances interspeaker variability and reduces intraspeaker variability. A new distance measure, the Distortion-Intersection Measure (DIM), is defined by the size and similarity of the intersection between test vectors and VQ codebook vectors. The proposed method, evaluated on a nine-talker database recorded over three years, achieves 99.0% speaker identification accuracy and 98.7% speaker verification accuracy.
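
As a rough illustration of the general VQ-based approach, the following minimal Python sketch trains one codebook per speaker and scores test vectors by average quantization distortion. It is not the method of this paper: TVN and DIM are not specified in the abstract, so a plain k-means codebook and a plain minimum-distortion score are substituted here. The two-dimensional synthetic features, the function names, the codebook size, and the verification threshold are all hypothetical choices made for the example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_codebook(vectors, size, iters=20, seed=0):
    """Plain k-means codebook (assumed stand-in for the paper's VQ training)."""
    rng = random.Random(seed)
    codebook = rng.sample(vectors, size)
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(size), key=lambda i: sq_dist(v, codebook[i]))
            cells[nearest].append(v)
        # Move each codeword to the mean of its cell; keep it if the cell is empty.
        codebook = [
            [sum(col) / len(cell) for col in zip(*cell)] if cell else codebook[i]
            for i, cell in enumerate(cells)
        ]
    return codebook

def avg_distortion(test_vectors, codebook):
    """Mean distance from each test vector to its nearest codeword (not DIM)."""
    total = sum(min(sq_dist(v, c) for c in codebook) for v in test_vectors)
    return total / len(test_vectors)

def identify(test_vectors, codebooks):
    """Identification: pick the speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda spk: avg_distortion(test_vectors, codebooks[spk]))

def verify(test_vectors, codebook, threshold):
    """Verification: accept the claimed speaker if distortion is under a threshold."""
    return avg_distortion(test_vectors, codebook) <= threshold

def synth(center, n, rng):
    """Synthetic 2-D features standing in for vocal tract and pitch parameters."""
    return [[rng.gauss(c, 0.3) for c in center] for _ in range(n)]

rng = random.Random(1)
centers = {"A": [0.0, 0.0], "B": [3.0, 3.0]}  # hypothetical speakers
codebooks = {s: train_codebook(synth(c, 200, rng), size=4) for s, c in centers.items()}
test = synth(centers["B"], 50, rng)
print(identify(test, codebooks))
```

Identification and verification share the same score in this sketch: identification compares the score across all enrolled codebooks, while verification compares it against a threshold for a single claimed speaker.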

Bibliographic reference. Matsui, Tomoko / Furui, Sadaoki (1990): "Text-independent speaker recognition using vocal tract and pitch information", in ICSLP-1990, 137-140.