Symposium on Machine Learning in Speech and Language Processing (MLSLP)

Bellevue, WA, USA
June 27, 2011

Some Open Problems in Machine Learning for NLP

Mark Steedman

School of Informatics, University of Edinburgh, UK

Natural language processing is obstructed by two problems: ambiguity and skewed distributions. Together they engender acute sparsity of data for the supervised learning of both grammars and parsing models.
   The paper expresses some pessimism about the prospects for getting around this data sparsity using unsupervised methods, and considers the possibility of finding naturally labeled datasets to extend supervised methods.

Bibliographic reference.  Steedman, Mark (2011): "Some open problems in machine learning for NLP", In MLSLP-2011 (abstract).