Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Comparison of Time-Dependent Acoustic Features for a Speaker-Independent Speech Recognition System

D. Dubois

Centre National d'Etudes des Telecommunications, LAA/TSS/RCP, Lannion, France

The experiments carried out in our laboratories aimed to improve speech recognition performance. This paper shows how performance was maximized by combining the temporal variations of the coefficients with the instantaneous vector as input to the HMM. Tests were conducted on several databases (digits, isolated commands of a vocal server, and 2-digit numbers) recorded over the telephone. The concatenation of several frames, linear data reduction techniques, and linear regression were then tested. A considerable improvement in recognition performance was obtained by combining the current frame with the first and second derivatives computed over five adjacent frames: a recognition error rate of 0.7% was obtained, compared with the 2.7% normally observed using cepstrum coefficients alone, an error rate reduction of roughly 70%.

Keywords: HMM input coefficients, temporal variations, principal component analysis, discriminant analysis, linear regression.
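The combination of the current frame with first and second derivatives can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: the derivative is taken as the least-squares regression slope over five frames (two on each side), the second derivative is obtained by applying the same regression to the first derivative, and edge frames are repeated at the utterance boundaries. The names `regression_delta` and `add_dynamic_features` are hypothetical.

```python
from typing import List

Frame = List[float]  # one vector of cepstrum coefficients


def regression_delta(frames: List[Frame], half_window: int = 2) -> List[Frame]:
    """Least-squares slope of each coefficient over 2*half_window + 1 frames.

    Assumption: frames beyond the utterance boundaries are replaced by the
    first or last frame (index clamping).
    """
    n = len(frames)
    # Normalisation term: sum of k^2 for k = -half_window .. half_window.
    norm = 2 * sum(k * k for k in range(1, half_window + 1))
    deltas: List[Frame] = []
    for t in range(n):
        delta: Frame = []
        for i in range(len(frames[t])):
            acc = 0.0
            for k in range(1, half_window + 1):
                nxt = frames[min(t + k, n - 1)][i]
                prv = frames[max(t - k, 0)][i]
                acc += k * (nxt - prv)
            delta.append(acc / norm)
        deltas.append(delta)
    return deltas


def add_dynamic_features(frames: List[Frame]) -> List[Frame]:
    """Concatenate each frame with its first and second derivatives."""
    d1 = regression_delta(frames)  # first derivative (assumed: regression slope)
    d2 = regression_delta(d1)      # second derivative (assumed: slope of the slope)
    return [c + a + b for c, a, b in zip(frames, d1, d2)]
```

For a sequence of 12-coefficient cepstrum frames, `add_dynamic_features` would return 36-dimensional observation vectors, which could then be used as HMM input in place of the static coefficients alone.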


Bibliographic reference. Dubois, D. (1991): "Comparison of time-dependent acoustic features for a speaker-independent speech recognition system", in EUROSPEECH-1991, 935-938.