Constrained Ratio Mask for Speech Enhancement Using DNN

Hongjiang Yu, Wei-Ping Zhu, Yuhong Yang


Speech enhancement has many applications in robust speech processing. Masking-based algorithms, an important class of speech enhancement methods, aim to retain the speech-dominant components and suppress the noise-dominant components of noisy speech. In this paper, we derive a new type of mask, the constrained ratio mask (CRM), which can better control the trade-off between speech distortion and residual noise in the enhanced speech. A deep neural network (DNN) is then employed to estimate the CRM in noisy conditions. Finally, the estimated CRM is applied to the noisy speech for denoising. Experimental results show that speech enhanced with the new masking scheme has better quality than speech enhanced with three existing masks under various noisy conditions.
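
The general masking idea described above can be sketched in a few lines. The abstract does not give the CRM formula, so this example assumes the standard ideal ratio mask (IRM) as a stand-in; the function names, the exponent `beta`, and the toy values are all hypothetical and are not taken from the paper. In practice the mask would be estimated by a DNN from noisy features, whereas here it is computed from known speech and noise powers purely for illustration.

```python
def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Compute a standard ideal ratio mask (an assumption, not the paper's CRM).

    Inputs are lists of frames, each frame a list of per-frequency-bin powers.
    Mask values lie in [0, 1]: near 1 where speech dominates, near 0 where
    noise dominates.
    """
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_row, n_row):
            total = s + n
            # Guard against empty bins where both powers are zero.
            row.append((s / total) ** beta if total > 0 else 0.0)
        mask.append(row)
    return mask


def apply_mask(mask, noisy_magnitude):
    """Scale each time-frequency bin of the noisy magnitude by its mask value."""
    return [
        [m * y for m, y in zip(m_row, y_row)]
        for m_row, y_row in zip(mask, noisy_magnitude)
    ]


# Toy example: 2 frames x 2 frequency bins (illustrative values only).
speech_power = [[9.0, 0.0], [1.0, 4.0]]
noise_power = [[1.0, 4.0], [1.0, 0.0]]
noisy_magnitude = [[2.0, 2.0], [2.0, 2.0]]

mask = ideal_ratio_mask(speech_power, noise_power)
enhanced = apply_mask(mask, noisy_magnitude)
```

Bins that contain only noise are zeroed and bins that contain only speech pass through unchanged; the intermediate values are where a mask design trades speech distortion against residual noise.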


DOI: 10.21437/Interspeech.2020-1920

Cite as: Yu, H., Zhu, W., Yang, Y. (2020) Constrained Ratio Mask for Speech Enhancement Using DNN. Proc. Interspeech 2020, 2427-2431, DOI: 10.21437/Interspeech.2020-1920.


@inproceedings{Yu2020,
  author={Hongjiang Yu and Wei-Ping Zhu and Yuhong Yang},
  title={{Constrained Ratio Mask for Speech Enhancement Using DNN}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2427--2431},
  doi={10.21437/Interspeech.2020-1920},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1920}
}