4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
This paper describes a speaker recognition model that uses the Two-Dimensional Mel-Cepstrum (TDMC) and a predictive neural network. Each speaker model consists of two networks. The first is a self-organizing VQ map (Kohonen's feature map). The second is a predictive network that learns the transitional patterns on the feature map of that speaker's model. TDMC consists of averaged features and dynamic features of the two-dimensional mel-log spectra in the analyzed interval. The measure for speaker recognition is obtained by combining the VQ distortion on the feature map with the prediction error of the predictive network. Text-independent speaker identification experiments were carried out for 8 speakers. The experimental results show that the combination of a feature map and a predictive network is effective, and that the proposed model using TDMC is robust with respect to the time interval.
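As an illustration only, the combined recognition measure described in the abstract could be sketched as below. This is not the authors' implementation. The abstract does not state how the two terms are combined, so the weighted sum, the `weight` parameter, and the squared Euclidean distance are assumptions. The predictive neural network is replaced here by a hypothetical lookup table, `predicted_next`, that maps the previous frame's codeword index to a predicted feature vector. All function names are likewise hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
# A speaker model is (codebook, predicted_next). Both parts are toy
# stand-ins for the feature map and the predictive network.
Model = Tuple[List[Vector], List[Vector]]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (an assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: List[Vector], x: Vector) -> Tuple[int, float]:
    """Return the index of the closest codeword and its distortion."""
    dists = [sq_dist(c, x) for c in codebook]
    idx = min(range(len(codebook)), key=dists.__getitem__)
    return idx, dists[idx]


def speaker_score(frames: List[Vector], codebook: List[Vector],
                  predicted_next: List[Vector], weight: float = 0.5) -> float:
    """Combine mean VQ distortion and mean prediction error.

    The weighted sum is an assumption; lower scores mean a better match.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    vq_total = 0.0
    pred_total = 0.0
    prev_idx = None
    for x in frames:
        idx, dist = nearest(codebook, x)
        vq_total += dist
        if prev_idx is not None:
            # Error between the vector predicted from the previous
            # codeword and the frame that was actually observed.
            pred_total += sq_dist(predicted_next[prev_idx], x)
        prev_idx = idx
    vq_mean = vq_total / len(frames)
    pred_mean = pred_total / max(len(frames) - 1, 1)
    return (1.0 - weight) * vq_mean + weight * pred_mean


def identify(frames: List[Vector], models: Dict[str, Model],
             weight: float = 0.5) -> str:
    """Pick the speaker whose model gives the lowest combined score."""
    return min(models,
               key=lambda name: speaker_score(frames, *models[name], weight))
```

In this toy setting, two speakers that share the same codebook can still be told apart by their transition patterns alone, which is the role the abstract assigns to the predictive network.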
Bibliographic reference. Kitamura, Tadashi / Takei, Shinsai (1996): "Speaker recognition model using two-dimensional mel-cepstrum and predictive neural network", In ICSLP-1996, 1772-1775.