5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Improving Performance on Switchboard by Combining Hybrid HME/HMM and Mixture of Gaussians Acoustic Models

Jürgen Fritsch, Michael Finke

Interactive Systems Laboratories, University of Karlsruhe, Germany, and Carnegie Mellon University, USA

This paper presents results from combining standard mixture-of-Gaussians acoustic modeling [10] with a context-dependent hybrid connectionist HME/HMM architecture [3, 4] on the Switchboard corpus. Using a score normalization scheme that is independent of each stream's modeling paradigm, together with adaptive methods for combining multiple probability distributions, we achieve relative reductions in word error rate of 3.5% and 9.3% compared to the two single-stream systems. In contrast to multiple acoustic streams that are all based on mixtures of Gaussians, integrating hybrid NN/HMM modeling appears advantageous, since the differences in modeling techniques and training algorithms allow the streams to capture different aspects of the speech signal. Low dependence among the streams' emission probability estimates is considered essential for potential gains in interpolated systems.
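The abstract does not spell out the normalization scheme or the combination rule, so the following is only a generic sketch of how two acoustic streams might be combined: each stream's per-state log-scores are normalized to a log-distribution over states, then interpolated log-linearly with fixed weights. The function names, the equal weights, and the example scores are all hypothetical, and the fixed weights stand in for the adaptive combination the paper describes.

```python
import math

def normalize_log_scores(log_scores):
    """Normalize log-scores so that exp() of the result sums to 1 over states.

    Assumption: a simple log-sum-exp normalization; the paper's actual
    scheme is not given in the abstract.
    """
    m = max(log_scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in log_scores))
    return [s - log_z for s in log_scores]

def combine_streams(stream_log_scores, weights):
    """Log-linear interpolation: weighted sum of normalized log-scores per state."""
    normalized = [normalize_log_scores(s) for s in stream_log_scores]
    n_states = len(normalized[0])
    return [
        sum(w * stream[i] for w, stream in zip(weights, normalized))
        for i in range(n_states)
    ]

# Hypothetical per-state scores for one frame from two streams
# with very different dynamic ranges.
gmm_scores = [-12.0, -10.5, -15.0]     # e.g. mixture-of-Gaussians log-likelihoods
hybrid_scores = [-1.2, -0.9, -2.5]     # e.g. hybrid-model log-scores

combined = combine_streams([gmm_scores, hybrid_scores], weights=[0.5, 0.5])
best_state = max(range(len(combined)), key=combined.__getitem__)
```

Normalizing before interpolation puts both streams on a comparable scale regardless of how each model produced its scores, which is the property the abstract attributes to its paradigm-independent normalization.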


Bibliographic reference. Fritsch, Jürgen / Finke, Michael (1997): "Improving performance on Switchboard by combining hybrid HME/HMM and mixture of Gaussians acoustic models", in EUROSPEECH-1997, 1963-1966.