5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

A Bootstrap Training Approach for Language Model Classifiers

Volker Warnke, Elmar Nöth, Jan Buckow, Stefan Harbeck, Heinrich Niemann

University of Erlangen, Germany

In this paper, we present a bootstrap training approach for language model (LM) classifiers. By training class-dependent LMs and running them in parallel, LMs can serve as classifiers for any kind of symbol sequence, e.g., word or phoneme sequences, for tasks like topic spotting or language identification (LID). Regardless of the particular symbol sequence used, each class's LM is trained on a manually labeled training set obtained from not necessarily cooperative speakers. We therefore have to expect some erroneous labels as well as deviations from the originally intended class specification, both of which can degrade classification. It may thus be better not to use all utterances for training but to automatically select those utterances that improve recognition accuracy; this can be done with a bootstrap procedure. We present the results achieved with our best approach on the VERBMOBIL corpus for the tasks of dialog act classification and LID.
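The two ideas in the abstract can be illustrated with a minimal sketch: one smoothed bigram LM per class, classification by picking the class whose LM assigns the highest likelihood, and a bootstrap loop that retains only those training utterances the current classifier assigns to their own class before retraining. This is an illustrative reconstruction, not the authors' implementation; the smoothing, the selection criterion, and the stopping rule are assumptions for the example.

```python
import math
from collections import defaultdict

class BigramLM:
    """Toy bigram language model with add-one smoothing (an assumption,
    not necessarily the paper's LM)."""
    def __init__(self):
        self.bigrams = defaultdict(int)
        self.unigrams = defaultdict(int)
        self.vocab = set()

    def train(self, sequences):
        for seq in sequences:
            tokens = ["<s>"] + seq
            for prev, cur in zip(tokens, tokens[1:]):
                self.bigrams[(prev, cur)] += 1
                self.unigrams[prev] += 1
                self.vocab.add(cur)
            self.vocab.add("<s>")

    def log_prob(self, seq):
        tokens = ["<s>"] + seq
        v = max(len(self.vocab), 1)
        return sum(
            math.log((self.bigrams.get((p, c), 0) + 1) /
                     (self.unigrams.get(p, 0) + v))
            for p, c in zip(tokens, tokens[1:]))

def classify(models, seq):
    """Parallel class-dependent LMs: the class whose LM scores highest wins."""
    return max(models, key=lambda c: models[c].log_prob(seq))

def bootstrap_select(labeled, n_rounds=3):
    """Bootstrap selection: keep only utterances whose own-class LM wins,
    then retrain on the kept set; stop when the selection is stable."""
    selected = dict(labeled)  # class label -> list of token sequences
    for _ in range(n_rounds):
        models = {c: BigramLM() for c in selected}
        for c, seqs in selected.items():
            models[c].train(seqs)
        new_selected = {c: [s for s in labeled[c] if classify(models, s) == c]
                        for c in labeled}
        if new_selected == selected:
            break
        # never let a class run empty (a safeguard assumed for the sketch)
        selected = {c: (new_selected[c] or selected[c]) for c in labeled}
    return selected
```

With a hypothetical two-class toy set in which one "greet" utterance is mislabeled, the bootstrap loop drops the mislabeled utterance while keeping the consistent ones, mirroring the abstract's motivation for not training on every labeled utterance.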


Bibliographic reference.  Warnke, Volker / Nöth, Elmar / Buckow, Jan / Harbeck, Stefan / Niemann, Heinrich (1998): "A bootstrap training approach for language model classifiers", In ICSLP-1998, paper 0316.