Second Language Studies: Acquisition, Learning, Education and Technology

Tokyo, Japan
September 22-24, 2010

Automatic Selection of Collocations for Instruction

Adam Skory, Maxine Eskenazi

Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA

No resource exists that comprehensively ranks collocations by their usefulness to learners. As a step towards a method for producing such a resource, we define a collocation's utility in terms of its unpredictability: the inability of a student to derive the meaning of the collocation from her semantic knowledge of its constituent words. To obtain empirical measures of predictability, we conduct an experiment comparing knowledge of phrasal verb collocations with familiarity with each collocation's verb constituent. We then investigate corpus-based methods for approximating collocation predictability and find statistically significant correlations between a subset of these methods and the experimental data. This demonstrates that automated statistical approaches can approximate the predictability of phrasal verbs, according to our measures, to a statistically significant degree. We intend for this research to lead to the development of resources for automated content selection in computer-assisted language learning (CALL).
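The general shape of such an evaluation can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the corpus-based measures, so pointwise mutual information (PMI) is used here as a familiar stand-in, and Spearman rank correlation is assumed as the way of comparing corpus scores with learner data. The function names `pmi_scores`, `ranks`, and `spearman` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def pmi_scores(pairs):
    """Pointwise mutual information for each (verb, particle) pair.

    `pairs` holds one (verb, particle) tuple per corpus occurrence.
    PMI is an assumed stand-in; the paper's actual measures may differ.
    """
    pair_counts = Counter(pairs)
    verb_counts = Counter(verb for verb, _ in pairs)
    particle_counts = Counter(particle for _, particle in pairs)
    total = len(pairs)
    scores = {}
    for (verb, particle), count in pair_counts.items():
        p_pair = count / total
        p_verb = verb_counts[verb] / total
        p_particle = particle_counts[particle] / total
        scores[(verb, particle)] = math.log2(p_pair / (p_verb * p_particle))
    return scores


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)
```

In use, one would compute a corpus score for each phrasal verb, pair it with that verb's experimentally measured predictability, and pass the two aligned lists to `spearman`. Testing whether the resulting coefficient is statistically significant is a separate step not shown here.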

