Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Development of Spoken Language Corpora for Travel Information

Lori Lamel, S. Rosset, S. Bennacef, H. Bonneau-Maynard, L. Devillers, Jean-Luc Gauvain

LIMSI - CNRS, Orsay, France

In this paper we report on our ongoing work in developing spoken language corpora in the context of information access for two travel domain tasks, l'Atis and Mask. The collection of spoken language corpora remains an important research area and represents a significant portion of the work in developing spoken language systems. Additional acoustic and language model training data has been shown to improve performance in continuous speech recognition almost systematically. Similarly, progress in spoken language understanding is closely linked to the availability of spoken language corpora. We record subjects on a regular basis using development versions of the spoken language systems for both tasks, obtaining over 1000 queries/month from 20 subjects. To help assess our progress in system development, since March 1995 each subject has completed a questionnaire addressing the user-friendliness, reliability, and ease of use of the Mask data collection system.

Bibliographic reference. Lamel, Lori / Rosset, S. / Bennacef, S. / Bonneau-Maynard, H. / Devillers, L. / Gauvain, Jean-Luc (1995): "Development of spoken language corpora for travel information", in EUROSPEECH-1995, 1961-1964.