CSL-EMG_Array: An Open Access Corpus for EMG-to-Speech Conversion

Lorenz Diener, Mehrdad Roustay Vishkasougheh, Tanja Schultz


We present a new open access corpus for the training and evaluation of EMG-to-Speech conversion systems based on array electromyographic recordings. The corpus was recorded using a paradigm that closely mirrors realistic EMG-to-Speech usage scenarios, and it includes evaluation data from both audible and silent speech. It consists of 9.5 hours of data, split into 12 sessions recorded from 8 speakers. Based on this corpus, we present initial benchmark results for a realistic online EMG-to-Speech conversion use case, on both the audible and the silent speech subsets. We also present a method for drastically improving EMG-to-Speech system stability and performance in the presence of time-related artifacts.


DOI: 10.21437/Interspeech.2020-2859

Cite as: Diener, L., Vishkasougheh, M.R., Schultz, T. (2020) CSL-EMG_Array: An Open Access Corpus for EMG-to-Speech Conversion. Proc. Interspeech 2020, 3745-3749, DOI: 10.21437/Interspeech.2020-2859.


@inproceedings{Diener2020,
  author={Lorenz Diener and Mehrdad Roustay Vishkasougheh and Tanja Schultz},
  title={{CSL-EMG_Array: An Open Access Corpus for EMG-to-Speech Conversion}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={3745--3749},
  doi={10.21437/Interspeech.2020-2859},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2859}
}