First International Conference on Spoken Language Processing (ICSLP 90)
This paper describes a phoneme filter neural network (PFN) approach to vowel recognition. Most conventional neural networks for speech recognition have a serious drawback: their output values do not correspond to candidate likelihoods. A PFN consists of one multi-layer neural network per phoneme category, each with fewer hidden units than input units. Each network is trained to perform an identity mapping on speech data belonging to a single phoneme category. During recognition, the similarity between the input data and the output data is computed for each network. Experiments on a Japanese vowel recognition task showed that the PFN recognition rates for the top two or more candidates are higher than those of a conventional 3-layer neural network. It was also confirmed that the PFN outputs represent candidate likelihoods.
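The recognition scheme summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the tanh hidden layer, the linear output layer, plain stochastic gradient descent, the negative squared Euclidean distance used as the similarity, and the synthetic 4-dimensional feature vectors are all assumptions made for illustration, and the names `train_filter`, `reconstruct`, `similarity`, and `rank_candidates` are hypothetical.

```python
import math
import random


def train_filter(samples, n_hidden=2, epochs=100, lr=0.05, seed=0):
    """Train one bottleneck network to reproduce its input (identity mapping)."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
    b2 = [0.0] * n_in
    for _ in range(epochs):
        for x in samples:
            # Forward pass: tanh hidden layer, linear output layer.
            h = [math.tanh(sum(w * v for w, v in zip(w1[j], x)) + b1[j])
                 for j in range(n_hidden)]
            y = [sum(w * v for w, v in zip(w2[i], h)) + b2[i] for i in range(n_in)]
            # Backward pass for the loss 0.5 * ||y - x||^2.
            dy = [y[i] - x[i] for i in range(n_in)]
            dz = [sum(w2[i][j] * dy[i] for i in range(n_in)) * (1.0 - h[j] ** 2)
                  for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    w2[i][j] -= lr * dy[i] * h[j]
                b2[i] -= lr * dy[i]
            for j in range(n_hidden):
                for k in range(n_in):
                    w1[j][k] -= lr * dz[j] * x[k]
                b1[j] -= lr * dz[j]
    return w1, b1, w2, b2


def reconstruct(net, x):
    """Pass x through one trained network and return its output vector."""
    w1, b1, w2, b2 = net
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(w2, b2)]


def similarity(x, y):
    """Assumed similarity: negative squared Euclidean distance (0 is a perfect match)."""
    return -sum((a - b) ** 2 for a, b in zip(x, y))


def rank_candidates(filters, x):
    """Score x against every per-category network, best candidate first."""
    scores = {label: similarity(x, reconstruct(net, x)) for label, net in filters.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic stand-in data: one hypothetical cluster center per vowel category.
CENTERS = {
    "a": [0.9, 0.1, 0.1, 0.5],
    "i": [0.1, 0.9, 0.1, 0.5],
    "u": [0.1, 0.1, 0.9, 0.5],
}
data_rng = random.Random(1)
filters = {}
for label, center in CENTERS.items():
    data = [[v + data_rng.gauss(0.0, 0.05) for v in center] for _ in range(30)]
    filters[label] = train_filter(data)  # each network sees only its own category

ranking = rank_candidates(filters, CENTERS["i"])
print(ranking)
```

Because every category receives its own score, the ranked list directly yields second and third candidates, which is the setting in which the abstract reports the PFN advantage. Turning these scores into calibrated likelihoods would require a further step that this sketch does not attempt.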
Bibliographic reference. Nakamura, Masami / Tamura, Shinichi (1990): "Vowel recognition by phoneme filter neural networks", In ICSLP-1990, 669-672.