First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Vowel Recognition by Phoneme Filter Neural Networks

Masami Nakamura, Shinichi Tamura

ATR Interpreting Telephony Research Laboratories, Kyoto, Japan

This paper describes a phoneme filter neural network (PFN) approach to vowel recognition. Most conventional speech recognition neural networks have a serious drawback: the network output values do not correspond to candidate likelihoods. A PFN consists of one multi-layer neural network per phoneme category, each with fewer hidden units than input units. Each network is trained as an identity mapping on speech data belonging to a single phoneme category. In the recognition process, the similarity between the input data and the output data is computed for each network. In an experiment on a Japanese vowel recognition task, the PFN recognition rates for the top 2 or more choices were higher than those of a conventional 3-layer neural network. It was also confirmed that the PFN outputs represent candidate likelihoods.
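The scheme in the abstract (one bottleneck network per category, trained to reproduce its input, then compared by input/output similarity) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the tanh hidden layer and linear output, plain gradient descent, negative squared reconstruction error as the similarity measure, the synthetic 6-dimensional data standing in for speech features, and the hypothetical names `IdentityNet`, `make_class`, and `rank_candidates`.

```python
import numpy as np


class IdentityNet:
    """Small input -> hidden -> output network trained to reproduce its input."""

    def __init__(self, n_in, n_hidden, rng):
        # Fewer hidden units than inputs, so the network can only
        # reproduce inputs that resemble its training category.
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b2 = np.zeros(n_in)

    def forward(self, X):
        H = np.tanh(X @ self.W1 + self.b1)   # hidden activations
        return H, H @ self.W2 + self.b2      # linear output layer

    def fit(self, X, epochs=1500, lr=0.02):
        """Full-batch gradient descent on squared reconstruction error."""
        n = len(X)
        for _ in range(epochs):
            H, Y = self.forward(X)
            dY = 2.0 * (Y - X) / n                   # d(loss)/d(output)
            dH = (dY @ self.W2.T) * (1.0 - H ** 2)   # backprop through tanh
            self.W2 -= lr * (H.T @ dY)
            self.b2 -= lr * dY.sum(axis=0)
            self.W1 -= lr * (X.T @ dH)
            self.b1 -= lr * dH.sum(axis=0)
        return self

    def similarity(self, X):
        """Assumed similarity: negative squared reconstruction error."""
        _, Y = self.forward(X)
        return -np.sum((Y - X) ** 2, axis=1)


def make_class(rng, mean_axis, spread_axis, n, dim=6):
    """Synthetic stand-in for one category's feature vectors (not real speech)."""
    X = rng.normal(0.0, 0.05, (n, dim))           # small isotropic noise
    X[:, mean_axis] += 1.0                         # category-specific offset
    X[:, spread_axis] += rng.normal(0.0, 0.5, n)   # category-specific variation
    return X


def rank_candidates(nets, x):
    """Return (label, score) pairs ordered from most to least similar."""
    scores = {label: float(net.similarity(x[None, :])[0])
              for label, net in nets.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


rng = np.random.default_rng(0)
layout = {"a": (0, 1), "i": (2, 3), "u": (4, 5)}   # hypothetical categories

# One identity-mapping network per category, trained only on that category.
nets = {label: IdentityNet(6, 2, rng).fit(make_class(rng, m, s, 200))
        for label, (m, s) in layout.items()}

# Held-out synthetic samples, classified by the top-ranked network.
test_sets = {label: make_class(rng, m, s, 50) for label, (m, s) in layout.items()}
correct = sum(rank_candidates(nets, x)[0][0] == label
              for label, X in test_sets.items() for x in X)
accuracy = correct / 150
print(f"top-1 accuracy on synthetic data: {accuracy:.2f}")
print(rank_candidates(nets, test_sets["i"][0]))
```

Because every category gets its own score, the ranked list returned by `rank_candidates` gives second and third choices directly, which is the kind of output the top-2 comparison in the abstract relies on. Whether such scores behave like likelihoods depends on the similarity measure, which this sketch does not attempt to reproduce.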

Bibliographic reference. Nakamura, Masami / Tamura, Shinichi (1990): "Vowel recognition by phoneme filter neural networks", in ICSLP-1990, 669-672.