Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Evaluation of Speaker-Independent Phoneme Recognition on TIMIT Database Using TDNNs

Nobuo Hataoka (1), Alex H. Waibel (2)

(1) Hitachi Dublin Laboratory, Trinity College, Dublin, Ireland
(2) School of Computer Science, Carnegie Mellon University, Pittsburgh, USA

This paper describes evaluation results and a new structure of Time-Delay Neural Networks (TDNNs) for speaker-independent, context-independent phoneme recognition. The proposed structure integrates several TDNNs separated according to phoneme duration, so that it handles phonemes of varying duration more effectively. To evaluate the proposed structure, recognition of 16 English vowels was performed using 5268 vowel tokens taken from 480 sentences spoken by 140 speakers (98 male and 42 female) in the TIMIT (TI-MIT) database. The integrated TDNNs achieved a 60.5% recognition rate, up from 56% for the single-TDNN structure, and gave a more stable recognition rate, demonstrating the effectiveness of the proposed structure.
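
The abstract does not say how the duration-separated TDNNs are combined, so the following is only a minimal sketch of one possible reading: each vowel token is routed by its length in frames to one of three small TDNNs, and each TDNN sums its output-layer activations over time to score the 16 vowel classes. The duration boundaries (10 and 20 frames), the 16 input coefficients per frame, the 8 hidden units, the delay widths, and the random untrained weights are all hypothetical values chosen for illustration, not values taken from the paper.

```python
import math
import random


def time_delay_layer(frames, weights, biases):
    """One time-delay layer: every output frame sees `delay` consecutive input frames.

    frames:  list of T input vectors
    weights: one entry per unit, each a list of `delay` weight vectors
    biases:  one bias per unit
    returns: list of T - delay + 1 output vectors (sigmoid activations)
    """
    delay = len(weights[0])
    if len(frames) < delay:
        raise ValueError("token is shorter than the layer's delay window")
    outputs = []
    for t in range(len(frames) - delay + 1):
        window = frames[t:t + delay]
        row = []
        for unit, b in zip(weights, biases):
            s = b
            for w_vec, x_vec in zip(unit, window):
                s += sum(w * x for w, x in zip(w_vec, x_vec))
            row.append(1.0 / (1.0 + math.exp(-s)))
        outputs.append(row)
    return outputs


def make_layer(rng, n_units, delay, n_in):
    """Random, untrained weights; a real system would learn these."""
    weights = [[[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                for _ in range(delay)] for _ in range(n_units)]
    return weights, [0.0] * n_units


def make_tdnn(rng, n_in, n_hidden, n_classes, delays):
    """Two stacked time-delay layers with the given (hypothetical) delay widths."""
    d1, d2 = delays
    return [make_layer(rng, n_hidden, d1, n_in),
            make_layer(rng, n_classes, d2, n_hidden)]


def classify(frames, layers):
    """Run the stacked layers, then sum each class activation over time."""
    h = frames
    for weights, biases in layers:
        h = time_delay_layer(h, weights, biases)
    return [sum(col) for col in zip(*h)]


def select_subnet(n_frames, subnets):
    """Assumed integration rule: pick the first sub-network whose limit covers the token."""
    for max_frames, layers in subnets:
        if n_frames <= max_frames:
            return layers
    return subnets[-1][1]


rng = random.Random(0)
N_IN, N_HIDDEN, N_CLASSES = 16, 8, 16  # hypothetical sizes; 16 classes = 16 vowels

# (max token length in frames, sub-network); boundaries are illustrative only
subnets = [
    (10, make_tdnn(rng, N_IN, N_HIDDEN, N_CLASSES, (2, 2))),
    (20, make_tdnn(rng, N_IN, N_HIDDEN, N_CLASSES, (3, 3))),
    (math.inf, make_tdnn(rng, N_IN, N_HIDDEN, N_CLASSES, (3, 5))),
]

# A fake 6-frame token stands in for a short vowel
token = [[rng.random() for _ in range(N_IN)] for _ in range(6)]
scores = classify(token, select_subnet(len(token), subnets))
best = max(range(N_CLASSES), key=scores.__getitem__)
print("predicted class index:", best)
```

Giving the short-duration network narrower delay windows keeps its receptive field within brief tokens, while the long-duration network can afford wider temporal context. Other integration schemes, such as combining the outputs of all sub-networks, are equally consistent with the abstract.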

Bibliographic reference. Hataoka, Nobuo / Waibel, Alex H. (1991): "Evaluation of speaker-independent phoneme recognition on TIMIT database using TDNNs", in EUROSPEECH-1991, 105-108.