This paper presents a new ensemble merging algorithm for triphone Hidden Markov Model (HMM) state tying, which was used in the AT&T spoken language recognizer for the official 1994 ARPA ATIS evaluation. In controlled experiments, we show that ensemble merging provides both more robust estimates of the triphone HMMs and increased HMM resolution. In general, distribution tying is performed by sharing (merging) the data of the available state ensembles to provide larger ensembles for likelihood function estimation. As an objective criterion for state tying, we define a total distortion function. Merging two or more ensembles never decreases the total distortion. To automatically tie acoustically similar states, we merge the ensembles that provide the smallest distortion increase. This straightforward technique has provided a relative word error rate reduction up to.
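The greedy merging idea described above can be illustrated with a minimal sketch. The abstract does not give the form of the total distortion function, so this example assumes the sum of squared deviations from the ensemble mean, a measure that shares the stated property that merging never decreases the total. The names `distortion`, `merge_cost`, `tie_states`, and the `target_count` stopping rule are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def distortion(ensemble):
    """Sum of squared deviations from the ensemble mean.

    Assumed distortion measure; the paper's actual function is not
    specified in the abstract.
    """
    n = len(ensemble)
    dim = len(ensemble[0])
    mean = [sum(v[d] for v in ensemble) / n for d in range(dim)]
    return sum((v[d] - mean[d]) ** 2 for v in ensemble for d in range(dim))


def merge_cost(a, b):
    """Increase in total distortion caused by merging ensembles a and b."""
    return distortion(a + b) - distortion(a) - distortion(b)


def tie_states(ensembles, target_count):
    """Greedily merge the pair of ensembles with the smallest distortion
    increase until only target_count ensembles remain.

    ensembles maps a state name to a list of feature vectors (tuples).
    target_count is a hypothetical stopping rule chosen for illustration.
    """
    tied = {name: list(frames) for name, frames in ensembles.items()}
    while len(tied) > target_count:
        # Pick the pair of state names whose merge adds the least distortion.
        a, b = min(
            combinations(tied, 2),
            key=lambda pair: merge_cost(tied[pair[0]], tied[pair[1]]),
        )
        # The merged ensemble pools the data of both tied states.
        tied[a + "+" + b] = tied.pop(a) + tied.pop(b)
    return tied
```

With three toy one-dimensional ensembles, two of which lie close together, calling `tie_states` with `target_count=2` would tie the two nearby states and leave the distant one alone. The exhaustive pairwise search is quadratic in the number of ensembles per merge, which is acceptable for a sketch but would likely need caching of merge costs at realistic triphone inventory sizes.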
Bibliographic reference. Bocchieri, Enrico / Riccardi, Giuseppe (1995): "State tying of triphone HMM's for the 1994 AT&T ARPA ATIS recognizer", In EUROSPEECH-1995, 1499-1502.