4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

A Comparison of Hybrid HMM Architectures Using Global Discriminative Training

Finn Tore Johansen

This paper compares different model architectures for TIMIT phoneme recognition. The baseline is a conventional diagonal-covariance Gaussian mixture HMM. This system is compared with two different hybrid MLP/HMMs, both subject to the same restrictions on input context and output states as the Gaussian mixture system. All free parameters in the three systems are jointly optimised using the same global discriminative criterion. A Forward decoder, with total likelihood scoring, is used for recognition. While the global discriminative training method is found to improve the baseline HMM significantly, the differences between the Gaussian and MLP-based architectures are small. The Gaussian mixture system does, however, perform slightly better at the lowest complexity levels.
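
The abstract does not describe the decoder in detail. As a rough illustration of what total likelihood scoring means, the sketch below runs the textbook forward recursion in the log domain, summing over all state paths rather than keeping only the best one as Viterbi scoring would. The function names, argument layout, and toy model are assumptions made for this example and are not taken from the paper.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(log_init, log_trans, log_emit):
    """Total log-likelihood of a frame sequence, summed over all state paths.

    Hypothetical argument layout (not from the paper):
      log_init[j]     : log P(state j at t = 0)
      log_trans[i][j] : log P(state j at t + 1 | state i at t)
      log_emit[t][j]  : log score of frame t in state j; it could come from
                        a Gaussian mixture or from an MLP, since the
                        recursion treats both the same way
    """
    n_states = len(log_init)
    # Initialisation: start in state j and emit the first frame.
    alpha = [log_init[j] + log_emit[0][j] for j in range(n_states)]
    # Recursion: sum over all predecessor states, then emit frame t.
    for t in range(1, len(log_emit)):
        alpha = [
            logsumexp([alpha[i] + log_trans[i][j] for i in range(n_states)])
            + log_emit[t][j]
            for j in range(n_states)
        ]
    # Termination: sum over all final states.
    return logsumexp(alpha)


def to_log(table):
    """Element-wise log of a nested list of positive probabilities."""
    return [[math.log(p) for p in row] for row in table]


# Toy two-state model with three frames (made-up numbers).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.1], [0.4, 0.3], [0.1, 0.8]]

toy_log_likelihood = forward_log_likelihood(
    [math.log(p) for p in init], to_log(trans), to_log(emit)
)
```

Replacing `logsumexp` with `max` in the recursion and termination steps would give best-path (Viterbi) scoring instead, which is the contrast the phrase "total likelihood scoring" draws.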


Bibliographic reference. Johansen, Finn Tore (1996): "A comparison of hybrid HMM architectures using global discriminative training", in ICSLP-1996, 498-501.