13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Language Modeling for Voice-Enabled Social TV Using Tweets

Junlan Feng, Bernard Renger

AT&T Labs-Research, Florham Park, NJ, USA

Social TV is a recent trend that integrates social media access with TV viewing. In this paper, we investigate approaches for building effective language models for a voice-enabled social TV application, in which viewers can speak their social media updates while watching TV. We propose to take advantage of social media data, specifically TV-related Twitter messages (tweets). The main challenge is the noisy nature of Twitter data. Our contributions are as follows. First, we collect tweets related to TV shows and provide a detailed analysis of the style mismatch between written tweets and spoken language. Second, we propose a learning-based approach that transforms tweets to make them more suitable for language modeling. This transformation considers the lexical, phonetic, and contextual similarity between misspellings and their canonical forms. Third, we build language models from normalized TV-related tweets together with other data resources, which are weighted to optimize speech recognition performance. The language model built from normalized tweets achieved higher speech recognition performance.
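The weighting of data resources mentioned in the abstract can be illustrated with a minimal sketch of linear language model interpolation. Everything here is an assumption made for illustration and is not taken from the paper: the toy sentences, the add-alpha unigram models, the function names, and the use of development-set perplexity as a stand-in for the speech recognition objective the authors optimize.

```python
import math
from collections import Counter

def unigram_lm(sentences, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary (illustrative)."""
    counts = Counter(w for s in sentences for w in s.split() if w in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def interpolate(lm_a, lm_b, lam):
    """Linear interpolation: P(w) = lam * P_a(w) + (1 - lam) * P_b(w)."""
    return {w: lam * lm_a[w] + (1.0 - lam) * lm_b[w] for w in lm_a}

def perplexity(lm, sentences):
    """Per-word perplexity of the model on the given sentences."""
    words = [w for s in sentences for w in s.split() if w in lm]
    log_prob = sum(math.log(lm[w]) for w in words)
    return math.exp(-log_prob / len(words))

def best_weight(lm_a, lm_b, dev, steps=10):
    """Grid search for the interpolation weight with the lowest dev perplexity."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda lam: perplexity(interpolate(lm_a, lm_b, lam), dev))

# Hypothetical toy data: normalized tweets, another text resource, and a dev set.
tweets = ["cant wait for the new episode tonight", "that finale was so good"]
other = ["the meeting is scheduled for monday", "please review the new report"]
dev = ["the new episode was so good"]
vocab = {w for s in tweets + other + dev for w in s.split()}

lm_tweets = unigram_lm(tweets, vocab)
lm_other = unigram_lm(other, vocab)
lam = best_weight(lm_tweets, lm_other, dev)
mixed = interpolate(lm_tweets, lm_other, lam)
print(f"weight on tweet model: {lam:.1f}")
```

A real system would use higher-order n-gram models and tune the weights against recognition accuracy rather than perplexity; the grid search above only shows the shape of the weighting step.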

Index Terms: Voice-Enabled Social TV, Text Normalization, Language Modeling

Bibliographic reference. Feng, Junlan / Renger, Bernard (2012): "Language modeling for voice-enabled social TV using tweets", In INTERSPEECH-2012, 2350-2353.