Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Cross-System Adaptation and Combination for Continuous Speech Recognition: The Influence of Phoneme Set and Acoustic Front-End

Sebastian Stüker (1), Christian Fügen (1), Susanne Burger (2), Matthias Wölfel (1)

(1) Universität Karlsruhe, Germany; (2) Carnegie Mellon University, USA

Cross-system adaptation and system combination methods, such as ROVER and confusion network combination, are known to lower the word error rate of speech recognition systems. They require systems that are reasonably close in performance but whose outputs differ in their errors; these differing errors provide the complementary information that leads to performance improvements. In this paper we demonstrate the gains we have seen with cross-system adaptation and system combination on the English EPPS and RT-05S lecture meeting tasks. We obtained the necessary variation between systems by using different acoustic front-ends and different phoneme sets as the basis for our models. In a set of contrastive experiments we show the influence that exchanging these components has on adaptation and system combination.

Bibliographic reference. Stüker, Sebastian / Fügen, Christian / Burger, Susanne / Wölfel, Matthias (2006): "Cross-system adaptation and combination for continuous speech recognition: the influence of phoneme set and acoustic front-end", in INTERSPEECH-2006, paper 1509-Mon3A2O.2.