12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Language Model Expansion Using Webdata for Spoken Document Retrieval

Ryo Masumura, Seongjun Hahm, Akinori Ito

Tohoku University, Japan

In recent years, there has been increasing demand for ad hoc retrieval of spoken documents. Existing text retrieval methods can be applied by transcribing spoken documents into text with a Large Vocabulary Continuous Speech Recognizer (LVCSR). However, retrieval performance is severely degraded by recognition errors and out-of-vocabulary (OOV) words. To address these problems, we previously proposed a document expansion method that compensates for deficiencies in the transcription by using text data downloaded from the Web. In this paper, we introduce two improvements to that document expansion framework. First, we use a large-scale sample database of webdata as the source of relevant documents, thus avoiding the bias introduced by keyword selection in the existing methods. Second, we use a document retrieval method based on a statistical language model (SLM), a popular framework in information retrieval, and we also propose a new smoothing method that takes recognition errors and missing keywords into account. Retrieval experiments show that the proposed methods yield good results.
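
As a rough illustration of the SLM-based retrieval mentioned above, the sketch below ranks documents by query likelihood under a unigram language model with Dirichlet-prior smoothing. This is a standard textbook technique chosen for illustration, not the new smoothing method proposed in the paper. The function names, the toy documents, and the value of `mu` are all assumptions.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection_counts, collection_len, mu=2000.0):
    """Return log P(query | doc) under a Dirichlet-smoothed unigram model.

    Illustrative only: a generic query-likelihood scorer, not the
    smoothing method proposed in the paper.
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    vocab_size = len(collection_counts)
    score = 0.0
    for term in query:
        # Background (collection) probability with an add-one floor so that
        # terms unseen in the whole collection do not produce log(0).
        p_bg = (collection_counts.get(term, 0) + 1) / (collection_len + vocab_size + 1)
        # Dirichlet-prior smoothing: interpolate document counts with the
        # background model, weighted by the pseudo-count mu.
        p = (doc_counts.get(term, 0) + mu * p_bg) / (doc_len + mu)
        score += math.log(p)
    return score


def rank(query, docs, mu=2000.0):
    """Rank document ids by descending query log-likelihood."""
    collection_counts = Counter(t for tokens in docs.values() for t in tokens)
    collection_len = sum(collection_counts.values())
    scores = {
        doc_id: query_log_likelihood(query, tokens, collection_counts, collection_len, mu)
        for doc_id, tokens in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy collection: each document is a list of tokens, e.g. an
# LVCSR transcript possibly expanded with related Web text.
toy_docs = {
    "doc_a": ["speech", "retrieval", "speech"],
    "doc_b": ["web", "data", "language"],
}
ranking = rank(["speech"], toy_docs, mu=10.0)
```

In a document expansion setting, the token list for each spoken document would presumably be enriched with terms from related Web text before scoring, so that query terms lost to recognition errors or OOV words can still contribute to the match.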


Bibliographic reference.  Masumura, Ryo / Hahm, Seongjun / Ito, Akinori (2011): "Language model expansion using webdata for spoken document retrieval", In INTERSPEECH-2011, 2133-2136.