First International Conference on Spoken Language Processing (ICSLP 90)
A. Waibel introduced Time-Delay Neural Networks (TDNNs) as a neural network architecture that is especially well adapted to the "dynamic nature of speech". We propose here to use low-dimensional TDNNs to discriminate between phonetic features. We evaluate their performance on each task and comment on the results. We also compare direct phoneme recognition scores obtained with a sophisticated classical classifier on the one hand and with a medium-size TDNN on the other. Additional results obtained after splitting our corpus into vowels and consonants are also reported. Experiments are conducted on a set of 5270 phonemes extracted from natural continuous speech uttered by one male speaker. Nearly all scores on binary phonetic features range between 90% and 99%. More complex tasks yield scores between 80% and 90%.
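As an illustration of the time-delay idea only, the following minimal sketch applies a single time-delay layer to a short sequence of frames and reduces its output to one score for a binary feature. The function names, the toy frame values, the weights, and the averaging of the output over time are all assumptions made for this example; the abstract does not specify the network dimensions, the input representation, or the training procedure.

```python
import math

def time_delay_layer(frames, weights, bias):
    """Apply one time-delay layer (hypothetical helper, not from the paper).

    frames:  list of T frames, each a list of D input values.
    weights: list of U units; each unit holds `delays` rows of D weights.
    bias:    list of U biases.
    Returns T - delays + 1 output frames, each with U sigmoid activations.
    """
    delays = len(weights[0])
    outputs = []
    for t in range(len(frames) - delays + 1):
        window = frames[t:t + delays]  # `delays` consecutive frames
        row = []
        for unit_w, b in zip(weights, bias):
            # The same weights are reused at every time position t.
            s = b + sum(w * x
                        for w_row, frame in zip(unit_w, window)
                        for w, x in zip(w_row, frame))
            row.append(1.0 / (1.0 + math.exp(-s)))
        outputs.append(row)
    return outputs

def binary_feature_score(frames, weights, bias):
    """Average the first output unit over time (an assumed simplification)."""
    out = time_delay_layer(frames, weights, bias)
    return sum(o[0] for o in out) / len(out)

# Toy input (assumed): 5 frames of 2 coefficients, one unit spanning 3 frames.
frames = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [1.0, 1.0], [0.0, 0.0]]
weights = [[[0.1, -0.2], [0.3, 0.0], [-0.1, 0.2]]]
bias = [0.0]
score = binary_feature_score(frames, weights, bias)
```

Because the weights are shared across time positions, the unit responds to a local pattern wherever it occurs in the window, which is the property usually cited for TDNNs on speech. A score above 0.5 would be read as "feature present" under this sketch.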
Bibliographic reference. Bimbot, Frédéric / Chollet, Gerard / Tubach, Jean-Pierre (1990): "Phonetic features extraction using time-delay neural networks", In ICSLP-1990, 665-668.