The feature set used with a classifier can have a large impact on classification performance. This paper presents a set of shrinkage-based features for Maximum Entropy and other classifiers in the exponential family. These features are inspired by the exponential class-based language model, Model M. We motivate the use of these features for the task of text classification and evaluate them on a natural language call routing task. The proposed features, along with a new word clustering method, result in significant improvements in action classification accuracy over typical word-based features, particularly for small amounts of training data.
Bibliographic reference. Sarikaya, Ruhi / Chen, Stanley F. / Ramabhadran, Bhuvana (2011): "Shrinkage-based features for natural language call routing", In INTERSPEECH-2011, 1309-1312.