All Together Now: The Living Audio Dataset

David A. Braude, Matthew P. Aylett, Caoimhín Laoide-Kemp, Simone Ashby, Kristen M. Scott, Brian Ó Raghallaigh, Anna Braudo, Alex Brouwer, Adriana Stan

The ongoing focus in speech technology research on machine-learning-based approaches leaves the community hungry for data. However, datasets tend to be recorded once and then released, sometimes behind registration requirements or paywalls. In this paper we describe our Living Audio Dataset. The aim is to provide audio data that is in the public domain, multilingual, and expandable by communities. We discuss the role of linguistic resources, given the success of systems such as Tacotron which use direct text-to-speech mappings, and consider how data provenance could be built into such resources. So far the data has been collected for TTS purposes; however, it is also suitable for ASR. At the time of publication, audio resources already exist for Dutch, R.P. English, Irish, and Russian.

DOI: 10.21437/Interspeech.2019-2448

Braude, D.A., Aylett, M.P., Laoide-Kemp, C., Ashby, S., Scott, K.M., Ó Raghallaigh, B., Braudo, A., Brouwer, A., Stan, A. (2019) All Together Now: The Living Audio Dataset. Proc. Interspeech 2019, 1521-1525, DOI: 10.21437/Interspeech.2019-2448.

