A Deep Learning-Based Kalman Filter for Speech Enhancement

Sujan Kumar Roy, Aaron Nicolson, Kuldip K. Paliwal


The existing Kalman filter (KF) suffers from poor estimates of the noise variance and the linear prediction coefficients (LPCs) in real-world noise conditions, which degrades speech enhancement performance. In this paper, a deep learning approach is used to estimate the noise variance and LPCs more accurately, enabling the KF to enhance speech in various noise conditions. Specifically, DeepMMSE, a deep learning approach to minimum mean-square error (MMSE)-based noise power spectral density (PSD) estimation, is used. The noise variance is computed from the estimated noise PSD. We also construct a whitening filter whose coefficients are computed from the estimated noise PSD. The filter is applied to the noisy speech, yielding pre-whitened speech from which the LPCs are computed. The improved noise variance and LPC estimates enable the KF to minimise the residual noise and distortion in the enhanced speech. Experimental results show that the proposed method produces enhanced speech of higher quality and intelligibility than the benchmark methods in various noise conditions over a wide range of SNR levels.
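The parameter-estimation chain described above (noise PSD, then noise variance and whitening filter, then LPCs from pre-whitened speech) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the noise PSD is taken as given (standing in for the DeepMMSE output) and is assumed to be a one-sided periodogram scaled by 1/N; the whitening coefficients are obtained here by inverse-transforming the PSD to an autocorrelation sequence and running Levinson-Durbin, which is one common way to do it. All function names are hypothetical.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for autocorrelation r.
    Returns (a, err), where a[0] = 1 and A(z) = sum_k a[k] z^-k is the
    prediction-error filter, and err is the prediction-error variance."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def noise_params_from_psd(noise_psd, n_fft, order):
    """Assumption: noise_psd is one-sided, length n_fft // 2 + 1, and
    scaled as E|V(k)|^2 / n_fft, so that lag 0 of its inverse DFT is
    the noise variance."""
    r = np.fft.irfft(noise_psd, n=n_fft)    # (circular) autocorrelation
    noise_var = r[0]
    a_w, _ = levinson_durbin(r, order)      # whitening filter coefficients
    return noise_var, a_w

def pre_whiten(noisy, a_w):
    """Apply the FIR whitening filter to the noisy speech."""
    return np.convolve(noisy, a_w)[:len(noisy)]

def lpcs_from_signal(x, order):
    """Autocorrelation-method LPCs and excitation variance of x."""
    n = len(x)
    r = np.array([np.dot(x[:n - m], x[m:]) for m in range(order + 1)]) / n
    return levinson_durbin(r, order)
```

In use, `noise_params_from_psd` would be called once per frame with the estimated noise PSD, `pre_whiten` applied to the noisy frame, and `lpcs_from_signal` run on the result; the returned LPCs, excitation variance, and noise variance are the quantities a KF needs. The sketch does not guard against all-zero frames, where `r[0]` is zero.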


DOI: 10.21437/Interspeech.2020-1551

Cite as: Roy, S.K., Nicolson, A., Paliwal, K.K. (2020) A Deep Learning-Based Kalman Filter for Speech Enhancement. Proc. Interspeech 2020, 2692-2696, DOI: 10.21437/Interspeech.2020-1551.


@inproceedings{Roy2020,
  author={Sujan Kumar Roy and Aaron Nicolson and Kuldip K. Paliwal},
  title={{A Deep Learning-Based Kalman Filter for Speech Enhancement}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2692--2696},
  doi={10.21437/Interspeech.2020-1551},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1551}
}