5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Improved Feature Decorrelation for HMM-based Speech Recognition

Kris Demuynck (1), Jacques Duchateau (1), Dirk Van Compernolle (2), Patrick Wambacq (1)

(1) Katholieke Universiteit Leuven - ESAT, Belgium
(2) Lernout & Hauspie, Belgium

Many HMM-based recognition systems use mixtures of diagonal covariance Gaussians to model the observation density functions in the states. These mixtures are, however, only approximations of the real distributions. One of the approximations is the assumption that the off-diagonal elements of the covariance matrices of the Gaussians are close to zero (diagonal covariance). To make this assumption more plausible, most recognition systems apply some kind of parameter decorrelation near the end of the preprocessing, e.g. the inverse cosine transform used with cepstral transformations. These transforms are, however, not optimal when it comes to decorrelating features at the level of the individual Gaussians. This paper presents a solution to the decorrelation problem that is optimal in a least-squares sense. It also demonstrates the link between the recently published maximum likelihood modelling for semi-tied covariance matrices and the presented least-squares optimisation. Evaluation on a large vocabulary recognition task shows a 10% relative improvement.
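
The gap between global and per-Gaussian decorrelation can be illustrated with a small toy sketch. It is not the algorithm of the paper, whose details the abstract does not give. It assumes two-dimensional features, a shared transform restricted to a pure rotation, a pooled covariance that ignores differences between the Gaussian means, and made-up covariances and weights. The function names are hypothetical. The sketch compares the rotation that diagonalises the pooled covariance with the rotation that minimises the weighted sum of squared off-diagonal elements over all Gaussians.

```python
import math

def off_diagonal(cov, theta):
    """Off-diagonal element of R(theta) * cov * R(theta)^T for a 2x2 covariance."""
    (a, b), (_, d) = cov
    return 0.5 * (d - a) * math.sin(2 * theta) + b * math.cos(2 * theta)

def cost(covs, weights, theta):
    """Weighted sum of squared off-diagonal elements over all Gaussians."""
    return sum(w * off_diagonal(c, theta) ** 2 for c, w in zip(covs, weights))

def pooled(covs, weights):
    """Weighted average covariance (toy assumption: mean differences ignored)."""
    total = sum(weights)
    return [[sum(w * c[i][j] for c, w in zip(covs, weights)) / total
             for j in range(2)] for i in range(2)]

def global_angle(covs, weights):
    """Rotation angle that diagonalises the pooled covariance."""
    (a, b), (_, d) = pooled(covs, weights)
    return 0.5 * math.atan2(2 * b, a - d)

def least_squares_angle(covs, weights, steps=3600):
    """Grid search over [0, pi/2), one full period of the cost function."""
    candidates = (k * (math.pi / 2) / steps for k in range(steps))
    return min(candidates, key=lambda t: cost(covs, weights, t))

# Made-up full covariances and mixture weights, for illustration only.
covs = [
    [[2.0, 0.9], [0.9, 1.0]],
    [[1.0, -0.3], [-0.3, 3.0]],
    [[1.5, 0.6], [0.6, 1.5]],
]
weights = [0.6, 0.1, 0.3]

theta_global = global_angle(covs, weights)
theta_ls = least_squares_angle(covs, weights)

print("pooled off-diagonal after global rotation:",
      off_diagonal(pooled(covs, weights), theta_global))
print("per-Gaussian cost, global rotation:       ",
      cost(covs, weights, theta_global))
print("per-Gaussian cost, least-squares rotation:",
      cost(covs, weights, theta_ls))
```

In this toy setting the global rotation removes the correlation of the pooled covariance exactly, yet leaves more residual correlation inside the individual Gaussians than the rotation chosen by the least-squares criterion.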

Bibliographic reference. Demuynck, Kris / Duchateau, Jacques / Van Compernolle, Dirk / Wambacq, Patrick (1998): "Improved feature decorrelation for HMM-based speech recognition", in ICSLP-1998, paper 1081.