CATOTRON — A Neural Text-to-Speech System in Catalan

Baybars Külebi, Alp Öktem, Alex Peiró-Lilja, Santiago Pascual, Mireia Farrús


We present Catotron, an open-source neural speech synthesis system for Catalan. Catotron consists of a sequence-to-sequence model trained on two small open-source datasets, one of semi-spontaneous speech and one of read speech. We demonstrate how a neural TTS system can be built for languages with limited resources using found-data optimization and cross-lingual transfer learning. We make the datasets, initial models and source code publicly available for both commercial and research purposes.
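The abstract does not spell out how the cross-lingual transfer is carried out. A common way to do it, shown here only as a minimal sketch and not necessarily the authors' procedure, is to warm-start the target-language model from a model trained on a higher-resource language, copying every parameter except those tied to the source language's symbol inventory. The function `warm_start`, the parameter names and the toy values below are all hypothetical, and parameters are plain Python lists rather than real tensors.

```python
def warm_start(pretrained, target, ignore_prefixes=("embedding.",)):
    """Return target parameters overwritten by compatible pretrained ones.

    Parameters whose names start with an ignored prefix, or whose size
    differs from the target's, keep their fresh target initialization.
    """
    merged = {name: list(value) for name, value in target.items()}
    for name, value in pretrained.items():
        if name.startswith(tuple(ignore_prefixes)):
            continue  # language-specific layer: keep the new initialization
        if name in target and len(target[name]) == len(value):
            merged[name] = list(value)  # reuse the source-language weights
    return merged


# Hypothetical toy parameters: the source model has 40 input symbols,
# the Catalan model has 48, so the embedding cannot be reused as-is.
pretrained_source = {
    "embedding.weight": [0.1] * 40,
    "encoder.weight": [0.5] * 8,
    "decoder.weight": [0.7] * 8,
}
catalan_init = {
    "embedding.weight": [0.0] * 48,
    "encoder.weight": [0.0] * 8,
    "decoder.weight": [0.0] * 8,
}

catalan_model = warm_start(pretrained_source, catalan_init)
```

In this sketch the encoder and decoder start from the source-language weights while the symbol embedding is learned from scratch, which is why a small target-language dataset can be enough for fine-tuning.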


Cite as: Külebi, B., Öktem, A., Peiró-Lilja, A., Pascual, S., Farrús, M. (2020) CATOTRON — A Neural Text-to-Speech System in Catalan. Proc. Interspeech 2020, 490-491.


@inproceedings{Kulebi2020,
  author={Baybars Külebi and Alp Öktem and Alex Peiró-Lilja and Santiago Pascual and Mireia Farrús},
  title={{CATOTRON — A Neural Text-to-Speech System in Catalan}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={490--491}
}