In this paper we describe the design and recording process of a new telephone speech database recorded at Telefónica Investigación y Desarrollo. It is designed for research in large-vocabulary, speaker-independent continuous speech recognition, speaker adaptation, and speaker verification in Spanish over the telephone line. The database is composed of two sets: (a) CEUDEX, the main set, with a corpus of 400 phonetically balanced sentences, and (b) SPATIS, a task-oriented set inspired by the ATIS (Air Travel Information System) standard application for English. The latter will be used for task-independent tests of the continuous speech recognizer. In the first stage of the recording procedure, a total of 21,500 sentences from nearly 300 speakers were collected.
Bibliographic reference. Torre-Munilla, Celinda de la / Hernandez-Gomez, Luis / Tapias, Daniel (1995): "CEUDEX: a data base oriented to context-dependent units training in Spanish for continuous speech recognition", In EUROSPEECH-1995, 845-848.