The 22 Language Telephone Speech Corpus is the newest effort in CSLU multi-language corpus development. Since November 1994 we have been collecting calls from speakers of 22 languages. The completed corpus will contain at least 200 speakers per language. All calls are verified by native speakers to ensure that callers followed instructions. In addition, a subset of the calls has been transcribed by native speakers of the respective languages. Conventions are being developed for phonetic transcription of a portion of the calls in each language. This corpus is useful for language identification research, as well as research into spoken language systems.
Bibliographic reference. Lander, T. / Cole, Ronald A. / Oshika, B. T. / Noel, M. (1995): "The OGI 22 language telephone speech corpus", In EUROSPEECH-1995, 817-820.