First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Speech Recognition Based on the Integration of FSVQ and Neural Network

Li-Qun Xu (1,2), Tie-Cheng Yu (1), G. D. Tattersall (2)

(1) Institute of Acoustics, Academia Sinica, Beijing, China
(2) School of Inform. Systems, Univ. of East Anglia, NR4, UK

In this paper we develop a novel technique for feeding a temporally variable speech signal to a Multi-Layered Perceptron (MLP), which generally accepts only fixed-dimension input patterns, for speech recognition. Instead of using conventional linear or nonlinear interpolation methods, the method is based on the integration of an MLP with a Finite-State Vector Quantizer (FSVQ), which is characterized by its ability to memorize the correlations between successive speech feature vectors. The FSVQ is designed to map the variable-length input pattern into an activation trace on a set of sub-codebooks, each of which corresponds to a 'frame' input unit of the MLP. Experiments show that, for the multi-speaker English alphabet E-set task, this approach achieves better performance than an MLP with a nonlinearly interpolated input of fixed dimension.
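To make the idea concrete, the sketch below (not the authors' exact algorithm) shows how a variable-length sequence of feature vectors can be reduced to a fixed-dimension activation trace over a set of sub-codebooks and then flattened for an MLP. The number of sub-codebooks K, the codebook size M, the feature dimension D, and the simple round-robin next-state rule are all illustrative assumptions; the paper's FSVQ would use trained codebooks and a trained next-state function that captures inter-frame correlations.

```python
# Minimal sketch of an FSVQ-style mapping from a variable-length utterance
# to a fixed-dimension activation trace suitable as MLP input.
# All sizes and the next-state rule are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)

K = 8    # number of sub-codebooks == number of MLP 'frame' input units (assumed)
M = 16   # codewords per sub-codebook (assumed)
D = 12   # feature-vector dimension, e.g. cepstral coefficients (assumed)

# Sub-codebooks; in a real FSVQ these would be trained, here they are random.
sub_codebooks = rng.normal(size=(K, M, D))

def activation_trace(frames: np.ndarray) -> np.ndarray:
    """Map a (T, D) feature sequence, with T variable, to a fixed (K*M,)
    activation trace: each frame activates its nearest codeword in the
    sub-codebook selected by the current state."""
    trace = np.zeros((K, M))
    state = 0                                   # initial state (assumed)
    for x in frames:
        dists = np.linalg.norm(sub_codebooks[state] - x, axis=1)
        trace[state, np.argmin(dists)] += 1.0   # accumulate activations
        # Simple round-robin next-state rule for illustration; the paper's
        # FSVQ uses a next-state function reflecting inter-frame correlations.
        state = (state + 1) % K
    # Normalise so utterance length does not dominate, then flatten for the MLP.
    return (trace / max(len(frames), 1)).ravel()

# Two utterances of different lengths map to the same input dimension K*M.
short_utt = rng.normal(size=(23, D))
long_utt = rng.normal(size=(61, D))
print(activation_trace(short_utt).shape, activation_trace(long_utt).shape)
```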


Bibliographic reference.  Xu, Li-Qun / Yu, Tie-Cheng / Tattersall, G. D. (1990): "Speech recognition based on the integration of FSVQ and neural network", In ICSLP-1990, 689-692.