Learning Intonation Pattern Embeddings for Arabic Dialect Identification

Aitor Arronte Alvarez, Elsayed Sabry Abdelaal Issa

This article presents an end-to-end pipeline for Arabic Dialect Identification (ADI) using intonation patterns and acoustic representations. Recent approaches to language and dialect identification use linguistically aware deep architectures that can capture phonetic differences among languages and dialects. In ADI tasks specifically, various combinations of linguistic features and acoustic representations have been successful with deep learning models. The approach presented here uses intonation patterns and hybrid residual and bidirectional LSTM networks to learn acoustic embeddings with no additional linguistic information. Experimental results show that intonation patterns for Arabic dialects provide sufficient information to achieve state-of-the-art results on the VarDial 17 ADI dataset, outperforming single-feature systems. The pipeline is robust to data sparsity, in contrast to other deep learning approaches that require large quantities of data. We conjecture that sufficient information is an important criterion for optimality in a deep learning ADI task and, more generally, in acoustic modeling problems. Small intonation patterns, when sufficient in an information-theoretic sense, allow deep learning architectures to learn more accurate speech representations.

DOI: 10.21437/Interspeech.2020-2906

Cite as: Alvarez, A.A., Issa, E.S.A. (2020) Learning Intonation Pattern Embeddings for Arabic Dialect Identification. Proc. Interspeech 2020, 472-476, DOI: 10.21437/Interspeech.2020-2906.

  author={Aitor Arronte Alvarez and Elsayed Sabry Abdelaal Issa},
  title={{Learning Intonation Pattern Embeddings for Arabic Dialect Identification}},
  booktitle={Proc. Interspeech 2020},