EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Using Boosting and POS Word Graph Tagging to Improve Speech Recognition

Christer Samuelsson (1), James L. Hieronymus (2)

(1) Xerox Research Centre Europe, France
(2) Research Institute for Advanced Computer Science, USA

The word graphs produced by a large-vocabulary speech recognition system usually contain a path labelled with the correct utterance, but this path is not always the highest-scoring one. Boosting increases the probability of words that occur often in the word graph and are, in some sense, robust. Adding syntactic information allows arc probabilities to be rescored, on the expectation that more grammatical word sequences are also more likely to be the correct ones. A theory is developed that allows general probabilistic syntactic models to be used to rescore word lattices. Experiments were conducted on the Wall Street Journal (WSJ) corpus with a version of the AT&T 1995 FST LVSR system, using part-of-speech (POS) trigram sequences. Using POS information alone leads to a loss in performance. Boosting alone provides an improvement in performance that is not statistically significant. Cascading the two methods, boosting first and then applying syntactic information, improves performance by 4.5% relative on a large portion of the 1995 DARPA test set.
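
The boosting idea can be illustrated with a minimal sketch. The abstract does not give the actual boosting formula, so everything here is an assumption made for illustration: the function name `boost_arcs`, the arc representation as `(start, end, word, prob)` tuples, the exponent `alpha`, and the choice to scale each arc by the relative frequency of its word in the graph and then renormalize per start node. It is not the authors' method, only one way to make frequent words more probable.

```python
from collections import Counter, defaultdict


def boost_arcs(arcs, alpha=1.0):
    """Illustrative boosting of word-graph arcs (hypothetical scheme).

    arcs  -- list of (start_node, end_node, word, prob) tuples
    alpha -- assumed exponent controlling the boosting strength
    """
    # Count how often each word labels an arc anywhere in the graph.
    counts = Counter(word for _, _, word, _ in arcs)
    total = sum(counts.values())

    # Scale each arc probability by the word's relative frequency.
    scaled = [
        (s, e, w, p * (counts[w] / total) ** alpha)
        for s, e, w, p in arcs
    ]

    # Renormalize so the arcs leaving each node sum to one again.
    out_mass = defaultdict(float)
    for s, _, _, p in scaled:
        out_mass[s] += p
    return [(s, e, w, p / out_mass[s]) for s, e, w, p in scaled]


# Toy word graph: "the" appears on two arcs, every other word on one.
arcs = [
    (0, 1, "the", 0.6),
    (0, 1, "a", 0.4),
    (1, 2, "cat", 0.5),
    (1, 2, "cap", 0.5),
    (2, 3, "the", 1.0),
]
boosted = boost_arcs(arcs, alpha=1.0)
```

In this toy graph the arc labelled "the" out of node 0 gains probability relative to "a" because "the" occurs twice in the graph, while the equally frequent "cat" and "cap" keep equal shares. A POS-based rescoring pass, as in the cascaded setup described above, would then operate on the boosted arc probabilities.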

Bibliographic reference.  Samuelsson, Christer / Hieronymus, James L. (2001): "Using boosting and POS word graph tagging to improve speech recognition", In EUROSPEECH-2001, 2143-2146.