UncommonVoice: A Crowdsourced Dataset of Dysphonic Speech

Meredith Moore, Piyush Papreja, Michael Saxon, Visar Berisha, Sethuraman Panchanathan

To facilitate more accessible spoken language technologies and advance the study of dysphonic speech, this paper presents UncommonVoice, a freely available, crowdsourced speech corpus consisting of 8.5 hours of speech from 57 individuals, 48 of whom have spasmodic dysphonia. The speech material consists of non-words (prolonged vowels and the prompt for diadochokinetic rate), sentences (randomly selected from TIMIT prompts and the CAPE-V intelligibility analysis), and spontaneous image descriptions. The data was recorded in a crowdsourced manner using a web-based application. This dataset is a fundamental resource for the development of voice-assistive technologies for individuals with dysphonia, as well as for enhancing the accessibility of voice-based technologies (automatic speech recognition, virtual assistants, etc.). Research on articulation differences, as well as on how best to model and represent dysphonic speech, will greatly benefit from a free and publicly available dataset of dysphonic speech. The dataset will be made freely and publicly available at www.uncommonvoice.org. In the following sections, we detail the data collection process and provide an initial analysis of the speech corpus.

DOI: 10.21437/Interspeech.2020-3093

Cite as: Moore, M., Papreja, P., Saxon, M., Berisha, V., Panchanathan, S. (2020) UncommonVoice: A Crowdsourced Dataset of Dysphonic Speech. Proc. Interspeech 2020, 2532-2536, DOI: 10.21437/Interspeech.2020-3093.

  author={Meredith Moore and Piyush Papreja and Michael Saxon and Visar Berisha and Sethuraman Panchanathan},
  title={{UncommonVoice: A Crowdsourced Dataset of Dysphonic Speech}},
  booktitle={Proc. Interspeech 2020},