Low-Latency Single Channel Speech Dereverberation Using U-Net Convolutional Neural Networks

Ahmet E. Bulut, Kazuhito Koishida


Reverberation, caused by reflections of the speech signal off physical obstacles, is one of the main difficulties in speech processing, along with non-stationary background noise. In this study, we explore deep neural network (DNN) based single-channel speech dereverberation and compare it against state-of-the-art systems. We propose a CNN auto-encoder architecture with skip connections that targets real-time, low-latency applications. The proposed system is evaluated on the REVERB challenge dataset, which includes simulated and real reverberant speech samples. Our experimental results show that, on the challenge evaluation dataset, the proposed system outperforms a baseline system that uses the DNN-based weighted prediction error (WPE) algorithm. We also extend the comparison to state-of-the-art systems using the most commonly used objective metrics, and our system achieves better results on most of them. Moreover, we perform a latency analysis of the proposed system and examine the trade-off between processing time and performance.
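
To make the latency trade-off mentioned above concrete, the following is a minimal sketch of how latency is commonly reasoned about for frame-based speech enhancement. It is not the authors' analysis. The function names, the frame sizes, the 16 kHz sample rate, and the per-frame processing time are all illustrative assumptions that do not come from the abstract.

```python
# Hypothetical helpers for reasoning about latency in frame-based processing.
# All names and numeric values here are assumptions for illustration only.

def algorithmic_latency_ms(win_len, hop_len, lookahead_frames, sample_rate):
    """Delay (ms) imposed by buffering alone, before any computation.

    A frame-based system must wait for one full analysis window, plus one
    hop for every future frame the model is allowed to look at.
    """
    if min(win_len, hop_len, sample_rate) <= 0 or lookahead_frames < 0:
        raise ValueError("sizes must be positive and lookahead non-negative")
    buffered_samples = win_len + lookahead_frames * hop_len
    return 1000.0 * buffered_samples / sample_rate


def real_time_factor(processing_ms_per_frame, hop_len, sample_rate):
    """Ratio of compute time per frame to the audio duration of one hop.

    A value below 1.0 means the model keeps up with the incoming stream.
    """
    hop_ms = 1000.0 * hop_len / sample_rate
    return processing_ms_per_frame / hop_ms


if __name__ == "__main__":
    # Assumed configuration: 512-sample window, 256-sample hop, 16 kHz audio.
    causal = algorithmic_latency_ms(512, 256, 0, 16000)
    with_lookahead = algorithmic_latency_ms(512, 256, 2, 16000)
    rtf = real_time_factor(4.0, 256, 16000)  # 4 ms per frame is made up
    print(causal, with_lookahead, rtf)
```

Under these assumptions, each additional look-ahead frame gives the model more future context at the cost of one more hop of delay, which is the kind of trade-off between processing time and performance that a latency analysis weighs.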


DOI: 10.21437/Interspeech.2020-2421

Cite as: Bulut, A.E., Koishida, K. (2020) Low-Latency Single Channel Speech Dereverberation Using U-Net Convolutional Neural Networks. Proc. Interspeech 2020, 2442-2446, DOI: 10.21437/Interspeech.2020-2421.


@inproceedings{Bulut2020,
  author={Ahmet E. Bulut and Kazuhito Koishida},
  title={{Low-Latency Single Channel Speech Dereverberation Using U-Net Convolutional Neural Networks}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2442--2446},
  doi={10.21437/Interspeech.2020-2421},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2421}
}