INTERSPEECH 2013

In language classification, measures like perplexity and the Kullback-Leibler divergence are used to compare language models. While this has the advantage of isolating the effect of the language model in speech and language processing problems, these measures have no clear relation to the corresponding classification error. In practice, an improvement in perplexity does not necessarily correspond to an improvement in the error rate. It is well known that the Bayes decision rule is optimal if the true distribution is used for classification. Since the true distribution is unknown in practice, a model distribution is used instead, which introduces suboptimality. We focus on the degradation introduced by a model distribution and provide an upper bound on the error difference between the Bayes decision rule and a model-based decision rule in terms of the f-divergence between the true and model distributions. Simulations are first presented to reveal a special case of the bound, followed by an analytic proof of the generalized bound and its tightness. In addition, the conditions that result in the boundary cases are discussed. Several instances of the bound are verified using simulations, and the bound is used to study the effect of the language model on the classification error.
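The quantity studied here, the gap between the model-based error and the Bayes error, can be illustrated with a small simulation. The sketch below is not the bound derived in the paper; it only checks a standard and looser chain on random discrete distributions: the error difference is at most the L1 distance between the true joint p(c, x) and the model joint q(c, x), which in turn is at most sqrt(2 KL(p || q)) by Pinsker's inequality (natural logarithm). All function names and the random setup are hypothetical choices made for illustration.

```python
import math
import random


def random_joint(num_classes, num_obs, rng):
    """Draw a strictly positive random joint distribution p(c, x) as a list [c][x]."""
    weights = [[rng.random() + 1e-9 for _ in range(num_obs)]
               for _ in range(num_classes)]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def error_rate(true_joint, decision_joint):
    """Error under true_joint when deciding argmax_c decision_joint[c][x] for each x."""
    num_classes = len(true_joint)
    num_obs = len(true_joint[0])
    correct = 0.0
    for x in range(num_obs):
        c_hat = max(range(num_classes), key=lambda c: decision_joint[c][x])
        correct += true_joint[c_hat][x]
    return 1.0 - correct


def l1_distance(p, q):
    """Sum over all (c, x) of |p(c, x) - q(c, x)|, i.e. twice the total variation."""
    return sum(abs(pv - qv)
               for p_row, q_row in zip(p, q)
               for pv, qv in zip(p_row, q_row))


def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q > 0 wherever p > 0."""
    return sum(pv * math.log(pv / qv)
               for p_row, q_row in zip(p, q)
               for pv, qv in zip(p_row, q_row)
               if pv > 0.0)


def check_bounds(trials=1000, num_classes=3, num_obs=5, seed=0):
    """Return (error difference, L1 distance, sqrt(2 KL)) for random (p, q) pairs."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        p = random_joint(num_classes, num_obs, rng)  # stands in for the true distribution
        q = random_joint(num_classes, num_obs, rng)  # stands in for the model distribution
        delta = error_rate(p, q) - error_rate(p, p)  # model-based error minus Bayes error
        results.append((delta, l1_distance(p, q),
                        math.sqrt(2.0 * kl_divergence(p, q))))
    return results


if __name__ == "__main__":
    results = check_bounds()
    # Largest observed ratio of the error difference to the Pinsker-based bound.
    print(max(delta / pinsker for delta, _, pinsker in results))
```

In every trial the three values should be ordered as error difference <= L1 distance <= sqrt(2 KL), up to floating-point tolerance. How far the observed ratio stays below 1 gives a rough impression of how loose this generic chain is, which is the kind of slack a tighter f-divergence bound aims to reduce.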
Bibliographic reference. Nussbaum-Thom, Markus / Beck, Eugen / Alkhouli, Tamer / Schlüter, Ralf / Ney, Hermann (2013): "Relative error bounds for statistical classifiers based on the f-divergence", In INTERSPEECH-2013, 2197-2201.