13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Automating Crowd-supervised Learning for Spoken Language Systems

Ian McGraw, Scott Cyphers, Panupong Pasupat, Jingjing Liu, James Glass

MIT Computer Science & Artificial Intelligence Laboratory, Cambridge, MA, USA

Spoken language systems often rely on static speech recognizers. When the underlying models are dynamic, training is usually performed using unsupervised methods. In this work, we explore an alternative approach that uses human computation to provide on-the-fly crowd-supervised training. Although the framework we describe is applicable to any stochastic model for which the training data can be generated by nonexperts, we demonstrate its utility on the lexicon and language model of a speech recognizer in a cinema voice-search domain. We show how an initially shaky system can achieve over a 10% absolute improvement in word error rate (WER), entirely without expert intervention. We then analyze how these gains were made.

Bibliographic reference. McGraw, Ian / Cyphers, Scott / Pasupat, Panupong / Liu, Jingjing / Glass, James (2012): "Automating crowd-supervised learning for spoken language systems", in INTERSPEECH-2012, 2474-2477.