An NMF-HMM Speech Enhancement Method Based on Kullback-Leibler Divergence

Yang Xiang, Liming Shi, Jesper Lisby Højvang, Morten Højfeldt Rasmussen, Mads Græsbøll Christensen


In this paper, we present a novel supervised Non-negative Matrix Factorization (NMF) speech enhancement method based on a Hidden Markov Model (HMM) and the Kullback-Leibler (KL) divergence (NMF-HMM). Our algorithm applies the HMM to capture timing information, so that, unlike traditional NMF-based speech enhancement methods, it takes the temporal dynamics of the speech signal into account. More specifically, a sum of Poisson distributions, which leads to the KL divergence measure, is used as the observation model for each state of the HMM. This ensures that the parameter update rule of the proposed algorithm is identical to the multiplicative update rule, which is quick and efficient. In the training stage, this update rule is applied to train the NMF-HMM model. In the online enhancement stage, a novel minimum mean-square error (MMSE) estimator that incorporates the NMF-HMM is proposed to perform speech enhancement. The performance of the proposed algorithm is evaluated using perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI). The experimental results indicate that the proposed method outperforms current state-of-the-art NMF-based speech enhancement methods by 7% in STOI score.
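As a rough illustration of the multiplicative update rule mentioned in the abstract, the sketch below shows the standard KL-divergence NMF updates for a single non-negative matrix V ≈ WH. It is not the NMF-HMM itself: there are no HMM states and no MMSE estimator, and the function names (`kl_divergence`, `kl_nmf`), the random initialization, and the small constant `eps` are assumptions made only for this example.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence D(V || V_hat), summed over all entries."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def kl_nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Plain KL-NMF with multiplicative updates (illustrative sketch only).

    V    : non-negative array, shape (n_freq, n_frames), e.g. a magnitude spectrogram
    rank : number of basis vectors (hypothetical parameter name)
    Returns the basis matrix W, the activation matrix H, and the KL cost per iteration.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random non-negative initialization (an assumption of this sketch).
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    history = []
    for _ in range(n_iter):
        # Activation update: H <- H * (W^T (V / WH)) / (W^T 1)
        ratio = V / (W @ H + eps)
        H *= (W.T @ ratio) / (W.sum(axis=0)[:, None] + eps)
        # Basis update: W <- W * ((V / WH) H^T) / (1 H^T)
        ratio = V / (W @ H + eps)
        W *= (ratio @ H.T) / (H.sum(axis=1)[None, :] + eps)
        history.append(kl_divergence(V, W @ H))
    return W, H, history
```

Calling `kl_nmf(V, rank=4)` on any non-negative matrix returns factors that stay non-negative, because each update only multiplies by a non-negative ratio, and the recorded KL cost should not increase from one iteration to the next.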


DOI: 10.21437/Interspeech.2020-1047

Cite as: Xiang, Y., Shi, L., Højvang, J.L., Rasmussen, M.H., Christensen, M.G. (2020) An NMF-HMM Speech Enhancement Method Based on Kullback-Leibler Divergence. Proc. Interspeech 2020, 2667-2671, DOI: 10.21437/Interspeech.2020-1047.


@inproceedings{Xiang2020,
  author={Yang Xiang and Liming Shi and Jesper Lisby Højvang and Morten Højfeldt Rasmussen and Mads Græsbøll Christensen},
  title={{An NMF-HMM Speech Enhancement Method Based on Kullback-Leibler Divergence}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2667--2671},
  doi={10.21437/Interspeech.2020-1047},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1047}
}