This paper analyzes how various system parameters influence the output quality of our neural-network-based real-time EMG-to-Speech conversion system. The system converts facial surface electromyographic (EMG) signals directly into audible speech in real time, enabling a closed-loop setup in which users receive direct audio feedback. Such a setup opens new avenues for research and applications through co-adaptation approaches. We evaluate the influence of several parameters on output quality: time context, EMG-audio delay, network size, training data size, and Mel spectrogram size. Output quality is measured with the objective short-time objective intelligibility (STOI) measure.
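To make two of the parameters named above more concrete, the following is a minimal sketch of how a time context and an EMG-audio delay could shape the training pairs for such a system. It is not the authors' implementation: the function names `stack_context` and `align_with_delay`, the symmetric context window, the edge padding, and the frame-based delay are all assumptions made for illustration. A real-time system may well restrict how many future frames it uses.

```python
import numpy as np


def stack_context(features, context):
    """Concatenate each frame with `context` neighbours on either side.

    Hypothetical helper: the symmetric window and edge padding are
    illustrative assumptions, not taken from the paper.
    """
    n_frames = features.shape[0]
    # Repeat the first and last frame so every frame has a full window.
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")
    width = 2 * context + 1
    # Row t holds padded frames t .. t + 2 * context, flattened.
    return np.stack([padded[t:t + width].reshape(-1) for t in range(n_frames)])


def align_with_delay(emg_frames, audio_frames, delay):
    """Pair EMG frame t with audio frame t + delay (delay in frames, >= 0).

    Hypothetical helper: assumes both streams share the same frame rate.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    n = min(len(emg_frames), len(audio_frames) - delay)
    if n <= 0:
        raise ValueError("delay exceeds the available audio frames")
    return emg_frames[:n], audio_frames[delay:delay + n]


if __name__ == "__main__":
    emg = np.arange(12, dtype=float).reshape(6, 2)   # 6 frames, 2 features
    audio = np.arange(6, dtype=float).reshape(6, 1)  # 6 frames, 1 target
    inputs, targets = align_with_delay(stack_context(emg, 1), audio, 2)
    print(inputs.shape, targets.shape)
```

In this sketch, increasing `context` widens the network input, while `delay` shifts which audio frame each EMG window is asked to predict.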
DOI: 10.21437/Interspeech.2018-2080
Cite as: Diener, L., Schultz, T. (2018) Investigating Objective Intelligibility in Real-Time EMG-to-Speech Conversion. Proc. Interspeech 2018, 3162-3166, DOI: 10.21437/Interspeech.2018-2080.
@inproceedings{Diener2018,
  author={Lorenz Diener and Tanja Schultz},
  title={Investigating Objective Intelligibility in Real-Time EMG-to-Speech Conversion},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3162--3166},
  doi={10.21437/Interspeech.2018-2080},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2080}
}