Laughter Synthesis: Combining Seq2seq Modeling with Transfer Learning

Noé Tits, Kevin El Haddad, Thierry Dutoit


Despite the growing interest in expressive speech synthesis, the synthesis of nonverbal expressions remains an under-explored area. In this paper, we propose an audio laughter synthesis system based on a sequence-to-sequence TTS synthesis system. We leverage transfer learning by training a deep learning model to generate both speech and laughs from annotations. We evaluate our model with a listening test, comparing its performance to that of an HMM-based laughter synthesis system, and find that it achieves higher perceived naturalness. Our solution is a first step towards a TTS system able to synthesize speech with control over the amusement level through laughter integration.
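As a rough illustration of the transfer-learning idea described above, the sketch below fine-tunes a small encoder-decoder acoustic model on input sequences that mix phoneme symbols with laughter annotation symbols. It is a minimal, hypothetical example: the class name `Seq2SeqTTS`, the vocabulary sizes, the dummy data, and the omission of attention are all assumptions for brevity, not the authors' actual architecture or released code.

```python
# Hypothetical sketch of the transfer-learning setup (assumed names, no attention).
import torch
import torch.nn as nn

class Seq2SeqTTS(nn.Module):
    """Minimal stand-in for a seq2seq (encoder-decoder) acoustic model."""
    def __init__(self, vocab_size, mel_dim=80, hidden=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(mel_dim, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, mel_dim)

    def forward(self, tokens, mels):
        # Encode the symbol sequence (phonemes plus laughter annotation symbols).
        _, state = self.encoder(self.embedding(tokens))
        # Teacher forcing: decode conditioned on ground-truth mel frames and the
        # final encoder state (attention omitted to keep the sketch short).
        dec_out, _ = self.decoder(mels, state)
        return self.proj(dec_out)

# 1) Start from a model (pre)trained on a large speech-only corpus.
SPEECH_VOCAB = 100   # size of the phoneme inventory (assumption)
LAUGH_TOKENS = 5     # extra symbols reserved for laughter annotations (assumption)
model = Seq2SeqTTS(SPEECH_VOCAB + LAUGH_TOKENS)
# model.load_state_dict(torch.load("pretrained_tts.pt"), strict=False)  # hypothetical checkpoint

# 2) Fine-tune on the laughter corpus, where inputs mix phonemes and laughter symbols.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
tokens = torch.randint(0, SPEECH_VOCAB + LAUGH_TOKENS, (8, 50))  # dummy token batch
mels = torch.randn(8, 200, 80)                                   # dummy mel targets
pred = model(tokens, mels)
loss = nn.functional.l1_loss(pred, mels)
loss.backward()
optimizer.step()
```

In this reading, transfer learning amounts to reusing the weights learned on plain speech and continuing training on data whose input symbols include laughter annotations, so the model learns to produce laughs with the same machinery it uses for speech.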


 DOI: 10.21437/Interspeech.2020-1423

Cite as: Tits, N., El Haddad, K., Dutoit, T. (2020) Laughter Synthesis: Combining Seq2seq Modeling with Transfer Learning. Proc. Interspeech 2020, 3401-3405, DOI: 10.21437/Interspeech.2020-1423.


@inproceedings{Tits2020,
  author={Noé Tits and Kevin El Haddad and Thierry Dutoit},
  title={{Laughter Synthesis: Combining Seq2seq Modeling with Transfer Learning}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={3401--3405},
  doi={10.21437/Interspeech.2020-1423},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1423}
}