September 22-25, 1997
This paper presents results of our efforts to combine standard mixture-of-Gaussians acoustic modeling with a context-dependent hybrid connectionist HME/HMM architecture [3, 4] on the Switchboard corpus. Using a score normalization scheme that is independent of each stream's modeling paradigm, together with adaptive methods for combining multiple probability distributions, we achieve relative decreases in word error rate of 3.5% and 9.3% compared to the two single-stream systems. In contrast to multiple acoustic streams that are all based on mixtures of Gaussians, the integration of hybrid NN/HMM-based modeling appears to be advantageous, since the differences in modeling techniques and training algorithms allow the streams to capture different aspects of the speech signal. Low dependence among the streams' emission probability estimates is considered essential for potential gains in interpolated systems.
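As a rough illustration of the kind of stream combination described above, the following minimal sketch normalizes the per-state log scores of two acoustic streams and interpolates them log-linearly. It is not the scheme used in the paper: the function names `normalize_log_scores` and `combine_streams`, the log-sum-exp normalization, and the fixed weight `weight_a` are all assumptions made for illustration, whereas the paper refers to adaptive combination methods whose details are not given in this abstract.

```python
import math


def normalize_log_scores(log_scores):
    """Rescale one frame's per-state log scores so they sum to 1 as probabilities.

    Assumption: a log-sum-exp normalization stands in for the paper's
    paradigm-independent score normalization, which is not specified here.
    """
    peak = max(log_scores)
    # Subtracting the peak before exponentiating avoids numerical underflow.
    log_z = peak + math.log(sum(math.exp(s - peak) for s in log_scores))
    return [s - log_z for s in log_scores]


def combine_streams(log_a, log_b, weight_a=0.5):
    """Log-linear interpolation of two streams' scores over the same states.

    Assumption: a single fixed weight is used; an adaptive scheme would
    instead choose the weight per state, per frame, or per utterance.
    """
    if len(log_a) != len(log_b):
        raise ValueError("both streams must score the same set of states")
    norm_a = normalize_log_scores(log_a)
    norm_b = normalize_log_scores(log_b)
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(norm_a, norm_b)]


if __name__ == "__main__":
    # Hypothetical log scores for three HMM states in a single frame.
    gaussian_stream = [-1.0, -2.0, -3.0]
    hybrid_stream = [-3.0, -1.0, -2.0]
    print(combine_streams(gaussian_stream, hybrid_stream, weight_a=0.5))
```

Normalizing before interpolation puts both streams on a comparable scale regardless of how their scores were produced, which is why the weight can be interpreted as the relative trust placed in each stream.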
Bibliographic reference. Fritsch, Jürgen / Finke, Michael (1997): "Improving performance on switchboard by combining hybrid HME/HMM and mixture of Gaussians acoustic models", In EUROSPEECH-1997, 1963-1966.