First International Conference on Spoken Language Processing (ICSLP 90)
Many researchers have achieved high phoneme recognition rates with multi-layered neural networks built from Linear Discrimination Neural units (LDN). However, the functions of such LDN networks are difficult to analyze. In this paper, we propose a multi-layered neural network with Elliptic Discrimination Neural units (EDN) so that the function of each unit in the network can be interpreted more easily. The center of a unit's elliptic discrimination boundary corresponds to a typical point in the input space. The radii of the ellipse correspond to its extent in the input space, so it becomes clear which components of the input are important to each unit in the EDN network. To compare EDN with LDN, we carried out recognition experiments on the phonemes /b,d,g/ using 5240 tokens from a Japanese speech database. The back-propagation learning procedure was used to train each network. In the experiments, the EDN network achieved recognition rates as high as those of the LDN network. We also confirmed which components of the input are important to each unit in the EDN network, and thereby found some regions of the input spectrogram that are significant for recognition. These regions are localized in both time and frequency, so the network is robust to time variations of the input vectors.
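The difference between the two unit types can be illustrated with a minimal sketch. The abstract does not give the exact form of the EDN activation, so the version below is an assumption: an axis-aligned ellipse with a sigmoid output that equals 0.5 on the boundary. The names `ldn_unit`, `edn_unit`, `center`, and `radii` are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Logistic function used as the output nonlinearity of both units."""
    return 1.0 / (1.0 + math.exp(-z))


def ldn_unit(x, weights, bias):
    """Linear unit: the decision boundary is the hyperplane w.x + b = 0."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def edn_unit(x, center, radii):
    """Elliptic unit (assumed form, not taken from the paper).

    The decision boundary is the axis-aligned ellipse
    sum(((x_i - c_i) / r_i) ** 2) = 1, so the output is above 0.5
    inside the ellipse and below 0.5 outside it.
    """
    d2 = sum(((xi - ci) / ri) ** 2 for xi, ci, ri in zip(x, center, radii))
    return sigmoid(1.0 - d2)


# Hypothetical two-component input: a small radius on component 0 and a
# large radius on component 1.
center = [0.0, 0.0]
radii = [1.0, 10.0]

at_center = edn_unit([0.0, 0.0], center, radii)    # typical point
on_boundary = edn_unit([1.0, 0.0], center, radii)  # exactly on the ellipse
shift_0 = edn_unit([2.0, 0.0], center, radii)      # moved along component 0
shift_1 = edn_unit([0.0, 2.0], center, radii)      # moved along component 1

print(at_center, on_boundary, shift_0, shift_1)
```

Under this assumed form, the same displacement pushes the output well below 0.5 along the small-radius component but barely changes it along the large-radius component. Reading off the learned center and radii in this way is one plausible route to seeing which input components matter to a unit, whereas the weights of a linear unit describe only the orientation of a hyperplane.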
Bibliographic reference. Kanedera, Noboru / Funada, Tetsuo (1990): "/b,d,g/ recognition with elliptic discrimination neural units", In ICSLP-1990, 1049-1052.