Odyssey 2012 - The Speaker and Language Recognition Workshop

June 25-28, 2012

New Resources for Recognition of Confusable Linguistic Varieties: The LRE11 Corpus

Stephanie Strassel, Kevin Walker, Karen Jones, Dave Graff, Christopher Cieri

Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, USA

The NIST 2011 Language Recognition Evaluation focuses on language pair discrimination for 24 languages/dialects, some of which may be considered mutually intelligible or closely related. The LRE11 evaluation required new data for all languages, comprising both conversational telephone speech and broadcast narrowband speech from multiple sources in each language. Given the potential confusion among varieties in the collection, manual language auditing required special care, including the assessment of inter-auditor consistency. We report on collection methods, auditing approaches, and results.

Bibliographic reference. Strassel, Stephanie / Walker, Kevin / Jones, Karen / Graff, Dave / Cieri, Christopher (2012): "New resources for recognition of confusable linguistic varieties: the LRE11 corpus", In Odyssey-2012, 202-208.