Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

The OGI 22 Language Telephone Speech Corpus

T. Lander, Ronald A. Cole, B. T. Oshika, M. Noel

Center for Spoken Language Understanding, Oregon Graduate Institute, Portland, OR, USA

The 22 Language Telephone Speech Corpus is the newest effort in CSLU multi-language corpus development. Since November 1994, we have been collecting calls from speakers of 22 languages. The completed corpus will contain at least 200 speakers per language. All calls are verified by native speakers to ensure that callers followed instructions. In addition, a subset of the calls has been transcribed by native speakers of the respective languages. Conventions are being developed for phonetic transcription of a portion of the calls in each language. This corpus is useful for language identification research, as well as research into spoken language systems.


Bibliographic reference. Lander, T. / Cole, Ronald A. / Oshika, B. T. / Noel, M. (1995): "The OGI 22 language telephone speech corpus", in EUROSPEECH-1995, 817-820.