EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology
2nd INTERSPEECH Event

Aalborg, Denmark
September 3-7, 2001

Recognition of Slovenian Speech: Within and Cross-Language Experiments on Monophones using the SpeechDat(II)

Andrej Iskra (1), Bojan Petek (1), Tom Brøndsted (2)

(1) University of Ljubljana, Slovenia
(2) Aalborg University, Denmark

Though the Slovenian SpeechDat(II) database is the largest spoken language resource ever recorded for Slovenian, it is among the smaller speech data collections made available by the European LE2-4001 project (http://www.speechdat.org/). The aim of this paper is to analyze this new Slovenian resource and to explore the possibilities of supplementing it with data recorded for other languages. The donor languages considered are English, German, and Danish. For each of these languages, four times as much speech data has been recorded (4000 speakers, compared to 1000 speakers in the Slovenian database). Our purely data-driven cross-language tests show that porting data across languages involves serious problems. These problems are partly due to differences in recording conditions (telephone line noise). Others can be explained by the different phonological structures of the analyzed languages.

Bibliographic reference. Iskra, Andrej / Petek, Bojan / Brøndsted, Tom (2001): "Recognition of Slovenian speech: within and cross-language experiments on monophones using the SpeechDat(II)", In EUROSPEECH-2001, 2777-2780.