Open-Source Consumer-Grade Indic Text To Speech

Andrew Wilkinson, Alok Parlikar, Sunayana Sitaram, Tim White, Alan W. Black, Suresh Bazaj


Open-source text-to-speech (TTS) software has enabled the development of voices in multiple languages, including many high-resource languages such as English and other European languages. However, building voices for low-resource languages remains challenging. We describe the development of TTS systems for 12 Indian languages using the Festvox framework, for which we developed a common frontend for Indian languages. Voices for eight of these 12 languages are available for use with Flite, a lightweight, fast run-time synthesizer, and with the Android Flite app available in the Google Play store. Recently, the baseline Punjabi TTS voice was built end-to-end in a month by two undergraduate students with no prior knowledge of TTS, with help from two of the authors of this paper. The framework can be used to build a baseline Indic TTS voice in two weeks once a text corpus has been selected and a suitable native speaker identified.


DOI: 10.21437/SSW.2016-31

Cite as

Wilkinson, A., Parlikar, A., Sitaram, S., White, T., Black, A.W., Bazaj, S. (2016) Open-Source Consumer-Grade Indic Text To Speech. Proc. 9th ISCA Speech Synthesis Workshop, 190-195.

BibTeX
@inproceedings{Wilkinson+2016,
author={Andrew Wilkinson and Alok Parlikar and Sunayana Sitaram and Tim White and Alan W. Black and Suresh Bazaj},
title={Open-Source Consumer-Grade Indic Text To Speech},
year=2016,
booktitle={9th ISCA Speech Synthesis Workshop},
doi={10.21437/SSW.2016-31},
url={http://dx.doi.org/10.21437/SSW.2016-31},
pages={190--195}
}