Malayalam-English Code-Switched: Grapheme to Phoneme System

Sreeja Manghat, Sreeram Manghat, Tanja Schultz


Grapheme-to-phoneme (G2P) conversion is an integral part of speech processing. Conversational speech in Malayalam, a low-resource Indic language, exhibits inter-sentential and intra-sentential code-switching as well as frequent intra-word code-switching with English. Monolingual G2P systems cannot process such intra-word code-switching. A G2P system that can handle code-switching, developed from Malayalam-English code-switched speech and text corpora, is presented. Since neither Malayalam nor English is a phonetic subset of the other, the overlapping phonemes of English and Malayalam are identified and analysed. The additional rules used in the G2P system to handle special cases of Malayalam phonemes and intra-word code-switching are also presented.
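
As an illustration of why intra-word code-switching defeats a monolingual G2P system, the sketch below splits a single token into script runs so that each run could be routed to a language-specific converter. It is not the method of the paper: it assumes the token is written in mixed script (Latin letters for the English part, Malayalam script for the rest), and the names `script_of` and `split_by_script` are hypothetical. The only external fact used is that the Unicode Malayalam block spans U+0D00 to U+0D7F.

```python
from itertools import groupby
from typing import List, Tuple

# Unicode Malayalam block (U+0D00..U+0D7F).
MALAYALAM_START, MALAYALAM_END = 0x0D00, 0x0D7F


def script_of(ch: str) -> str:
    """Classify one character as Malayalam ('ml'), ASCII Latin ('en'), or 'other'."""
    cp = ord(ch)
    if MALAYALAM_START <= cp <= MALAYALAM_END:
        return "ml"
    if ch.isascii() and ch.isalpha():
        return "en"
    # Assumption: everything else (digits, punctuation, joiners) is left unclassified.
    return "other"


def split_by_script(token: str) -> List[Tuple[str, str]]:
    """Split a token into maximal runs of characters that share a script label."""
    return [(script, "".join(chars)) for script, chars in groupby(token, key=script_of)]


# Hypothetical mixed-script token: English "car" followed by Malayalam characters.
segments = split_by_script("car\u0d3f\u0d7d")
```

Each `("en", ...)` or `("ml", ...)` segment could then be passed to the matching monolingual G2P component. A token written entirely in one script, which this sketch cannot split, would need rules of the kind the abstract mentions.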


DOI: 10.21437/Interspeech.2020-1936

Cite as: Manghat, S., Manghat, S., Schultz, T. (2020) Malayalam-English Code-Switched: Grapheme to Phoneme System. Proc. Interspeech 2020, 4133-4137, DOI: 10.21437/Interspeech.2020-1936.


@inproceedings{Manghat2020,
  author={Sreeja Manghat and Sreeram Manghat and Tanja Schultz},
  title={{Malayalam-English Code-Switched: Grapheme to Phoneme System}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={4133--4137},
  doi={10.21437/Interspeech.2020-1936},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1936}
}