The experiments carried out in our laboratories aimed to improve speech recognition performance. In this paper we show how we sought to maximize performance by combining the temporal variations of the coefficients with the instantaneous vector as input to the HMM. Tests were conducted on several databases (digits, isolated commands of a vocal server, and 2-digit numbers) recorded over the telephone. We then tested the concatenation of several frames, linear data reduction techniques, and linear regression. Recognition performance improved considerably when the first and second derivatives, taken over five adjacent frames, were combined with the current frame: a recognition error rate of 0.7% was obtained, against the 2.7% normally observed using cepstrum coefficients alone, an error rate reduction of about 70%.
Keywords: HMM input coefficients, temporal variations, principal component analysis, discriminant analysis, linear regression.
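As an illustration of the kind of input vector described above, the following minimal sketch estimates first and second temporal derivatives by linear regression over five adjacent frames and appends them to the current frame. It is not taken from the paper: the function names `regression_deltas` and `augment`, the padding of utterance edges by repeating the first and last frame, and the choice to obtain the second derivative by applying the same regression to the first derivatives are all assumptions made for the example.

```python
from typing import List

Frame = List[float]  # one vector of cepstrum coefficients per analysis frame


def regression_deltas(frames: List[Frame], half_window: int = 2) -> List[Frame]:
    """Least-squares slope of each coefficient over 2*half_window + 1 frames.

    half_window=2 gives the five-frame span mentioned in the abstract.
    Edge handling (repeating the first/last frame) is an assumption.
    """
    n = len(frames)
    norm = 2 * sum(k * k for k in range(1, half_window + 1))
    deltas: List[Frame] = []
    for t in range(n):
        row: Frame = []
        for c in range(len(frames[t])):
            acc = 0.0
            for k in range(1, half_window + 1):
                ahead = frames[min(t + k, n - 1)][c]
                behind = frames[max(t - k, 0)][c]
                acc += k * (ahead - behind)
            row.append(acc / norm)
        deltas.append(row)
    return deltas


def augment(frames: List[Frame]) -> List[Frame]:
    """Concatenate each frame with its first and second derivatives.

    The second derivative is computed here by regressing over the first
    derivatives; the paper may have used a different estimator.
    """
    d1 = regression_deltas(frames)
    d2 = regression_deltas(d1)
    return [f + a + b for f, a, b in zip(frames, d1, d2)]
```

For a single coefficient that rises by one unit per frame, the regression slope in the middle of the sequence is 1 and its second derivative is 0, so `augment` turns the frame `[3.0]` into a three-component vector holding the value, its slope, and its curvature. With real cepstrum vectors the output dimension is three times the input dimension, which is where data reduction techniques such as those listed in the keywords could come into play.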
Bibliographic reference. Dubois, D. (1991): "Comparison of time-dependent acoustic features for a speaker-independent speech recognition system", In EUROSPEECH-1991, 935-938.