Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Phoneme Recognition Using Recurrent Neural Networks

Yohji Fukuda, Haruya Matsumoto

Faculty of Engineering, Kobe University, Nada, Kobe, Japan

In this paper, we describe the results of introducing a prediction layer and a similarity index into phoneme recognition experiments based on a recurrent neural network. The output layer of the proposed network consists of a prediction layer and a recognition layer. The prediction layer predicts the next input vector from the present input vectors, and the recognition layer classifies them. The purpose of the prediction layer is to convey contextual information to the network. At every time step, the activation of the recognition layer is multiplied by the cosine of the angle between the predicted vector and the actual input vector. We call this cosine value the similarity index. When the predicted vector differs from the actual input vector, multiplication by the similarity index reduces the output of the recognition layer, so that incorrect classifications by the recognition layer are avoided.

Keywords: phoneme recognition, recurrent neural network, prediction layer, similarity index, contextual information
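
As an illustration of the similarity index described in the abstract, the following Python sketch computes the cosine between a predicted and an actual input vector and uses it to scale the recognition-layer activations. The function names, the example values, and the handling of zero-length vectors are assumptions made for this example; they are not taken from the paper, and the network that would produce the predictions and activations is not modeled here.

```python
import math

def similarity_index(predicted, actual):
    """Cosine of the angle between the predicted and the actual input vector."""
    dot = sum(p * a for p, a in zip(predicted, actual))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_a = math.sqrt(sum(a * a for a in actual))
    if norm_p == 0.0 or norm_a == 0.0:
        # Assumption: a zero vector is treated as having no similarity.
        return 0.0
    return dot / (norm_p * norm_a)

def gate_recognition(activations, predicted, actual):
    """Multiply every recognition-layer activation by the similarity index."""
    s = similarity_index(predicted, actual)
    return [s * a for a in activations]

# Hypothetical recognition-layer activations for one time step.
recognition = [0.9, 0.1, 0.3]

# Predicted vector points in the same direction as the actual input:
# the similarity index is close to 1, so the activations are nearly unchanged.
good_prediction = gate_recognition(recognition, [1.0, 2.0], [2.0, 4.0])

# Predicted vector is orthogonal to the actual input:
# the similarity index is 0, so the activations are suppressed.
poor_prediction = gate_recognition(recognition, [1.0, 0.0], [0.0, 1.0])
```

In this sketch a poor prediction shrinks all recognition outputs at that time step, which is the mechanism the abstract credits with avoiding incorrect classifications.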

Bibliographic reference. Fukuda, Yohji / Matsumoto, Haruya (1991): "Phoneme recognition using recurrent neural networks", in EUROSPEECH-1991, 1419-1423.