Real-Time Single-Channel Deep Neural Network-Based Speech Enhancement on Edge Devices

Nikhil Shankar, Gautam Shreedhar Bhat, Issa M.S. Panahi


In this paper, we present a deep neural network architecture comprising both convolutional neural network (CNN) and recurrent neural network (RNN) layers for real-time single-channel speech enhancement (SE). The proposed neural network model focuses on enhancing the noisy speech magnitude spectrum on a frame-by-frame basis. The developed model is implemented on a smartphone (an edge device) to demonstrate the real-time usability of the proposed method. Perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) test results are used to compare the proposed algorithm with previously published conventional and deep learning-based SE methods. Subjective ratings show the performance improvement of the proposed model over the other baseline SE methods.
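The frame-by-frame magnitude-spectrum enhancement described above can be sketched as a short analysis/synthesis loop: each windowed frame is transformed to the frequency domain, its magnitude is passed through an enhancement model, and the frame is reconstructed using the noisy phase. The sketch below is a minimal NumPy illustration of that pipeline only; the `model` callable is a hypothetical placeholder standing in for the paper's CNN+RNN network, whose exact architecture is not given in this abstract.

```python
import numpy as np

def enhance(noisy, frame_len=512, hop=256, model=None):
    """Frame-by-frame magnitude-spectrum enhancement (sketch).

    `model` maps a noisy magnitude frame to an enhanced magnitude
    frame; here it is a hypothetical placeholder for the paper's
    CNN+RNN network. With model=None the frame passes through
    unchanged, so the loop reduces to overlap-add resynthesis.
    """
    window = np.hanning(frame_len)
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))  # accumulated window energy for normalization
    for start in range(0, len(noisy) - frame_len + 1, hop):
        frame = noisy[start:start + frame_len] * window
        spec = np.fft.rfft(frame)
        mag, phase = np.abs(spec), np.angle(spec)
        mag_hat = model(mag) if model is not None else mag
        # Reconstruct with the enhanced magnitude and the noisy phase
        rec = np.fft.irfft(mag_hat * np.exp(1j * phase), n=frame_len)
        out[start:start + frame_len] += rec * window
        norm[start:start + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)
```

With the identity placeholder, the overlap-add loop reconstructs the interior of the input signal exactly, which is a useful sanity check before plugging in a trained enhancement network.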


 DOI: 10.21437/Interspeech.2020-1901

Cite as: Shankar, N., Bhat, G.S., Panahi, I.M. (2020) Real-Time Single-Channel Deep Neural Network-Based Speech Enhancement on Edge Devices. Proc. Interspeech 2020, 3281-3285, DOI: 10.21437/Interspeech.2020-1901.


@inproceedings{Shankar2020,
  author={Nikhil Shankar and Gautam Shreedhar Bhat and Issa M.S. Panahi},
  title={{Real-Time Single-Channel Deep Neural Network-Based Speech Enhancement on Edge Devices}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={3281--3285},
  doi={10.21437/Interspeech.2020-1901},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1901}
}