This paper describes a study of grapheme-based speech recognition for colloquial Arabic. An investigation of language and acoustic model configurations is carried out to illustrate the differences between colloquial and Modern Standard Arabic (MSA), using Levantine telephone conversations as an example. The study defines extensive and carefully crafted data sets for different dialects and studies their overlap with MSA sources. The use of grapheme models is re-investigated, and alternative configurations for acoustic models to correct obvious shortcomings are tested. Recognition performance is analyzed on two levels: corpus-level and dialect-level. In addition, modifications of dictionaries to allow better specification of sound patterns are explored. Overall, the experiments highlight the need for higher-level information on acoustic model selection.
Bibliographic reference. Al-Shareef, Sarah / Hain, Thomas (2011): "An investigation in speech recognition for colloquial Arabic", In INTERSPEECH-2011, 2869-2872.