A Recursive Network with Dynamic Attention for Monaural Speech Enhancement

Andong Li, Chengshi Zheng, Cunhang Fan, Renhua Peng, Xiaodong Li


For continuous speech processing, dynamic attention supports preferential processing, as described by the auditory dynamic attending theory. Accordingly, we propose DARCN, a framework for monaural speech enhancement that combines dynamic attention with recursive learning. In addition to a major noise reduction network, we design a separate sub-network that adaptively generates the attention distribution used to control the information flow throughout the major network. Recursive learning is introduced to reduce the number of trainable parameters by reusing a single network across multiple stages, where the intermediate output of each stage is refined with a memory mechanism. In this way, a more flexible and more accurate estimate can be obtained. We conduct experiments on the TIMIT corpus. Experimental results show that the proposed architecture consistently outperforms recent state-of-the-art models in terms of both PESQ and STOI scores.
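
To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the DARCN implementation. It assumes a magnitude spectrogram of shape (frames, frequency bins), a single-layer attention sub-network, a two-layer major network, and a simple gated memory update. The class name `RecursiveAttentionSketch`, the layer sizes, and the masking output are all hypothetical choices made for illustration. It shows only the control flow: one set of weights is reused at every stage, a separate sub-network produces attention values that gate the major network's features, and a memory state carries information from stage to stage.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RecursiveAttentionSketch:
    """Toy stand-in (hypothetical): one shared set of weights reused at every stage."""

    def __init__(self, n_freq, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        # Attention sub-network (assumed single layer): sees noisy input and previous estimate.
        self.w_att = rng.normal(0.0, scale, (2 * n_freq, n_hidden))
        # Major network (assumed two layers).
        self.w_in = rng.normal(0.0, scale, (2 * n_freq, n_hidden))
        self.w_out = rng.normal(0.0, scale, (n_hidden, n_freq))
        # Memory update weights (assumed gated update, not the paper's exact mechanism).
        self.w_mem = rng.normal(0.0, scale, (n_hidden, n_hidden))

    def n_params(self):
        # Parameter count is independent of how many stages are run.
        return sum(w.size for w in (self.w_att, self.w_in, self.w_out, self.w_mem))

    def stage(self, noisy, estimate, memory):
        x = np.concatenate([noisy, estimate], axis=-1)
        attention = sigmoid(x @ self.w_att)            # attention values in (0, 1)
        hidden = np.tanh(x @ self.w_in) * attention    # attention gates the major network's features
        gate = sigmoid(hidden @ self.w_mem)
        memory = gate * memory + (1.0 - gate) * hidden  # blend old memory with new features
        mask = sigmoid(memory @ self.w_out)            # mask in (0, 1) applied to the noisy input
        return mask * noisy, memory

    def enhance(self, noisy, n_stages=3):
        estimate = noisy.copy()
        memory = np.zeros((noisy.shape[0], self.w_mem.shape[0]))
        outputs = []
        for _ in range(n_stages):
            # The same weights are applied at every stage; only estimate and memory change.
            estimate, memory = self.stage(noisy, estimate, memory)
            outputs.append(estimate)
        return outputs


# Usage on random data standing in for a noisy magnitude spectrogram (5 frames, 8 bins).
noisy = np.abs(np.random.default_rng(1).normal(size=(5, 8)))
model = RecursiveAttentionSketch(n_freq=8, n_hidden=16)
params_before = model.n_params()
stages = model.enhance(noisy, n_stages=3)
params_after = model.n_params()
```

Because the stages share weights, running more stages in this sketch increases computation but not the parameter count, which is the property the abstract attributes to recursive learning. The weights here are random and untrained, so the outputs are not meaningful enhancements.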


DOI: 10.21437/Interspeech.2020-1513

Cite as: Li, A., Zheng, C., Fan, C., Peng, R., Li, X. (2020) A Recursive Network with Dynamic Attention for Monaural Speech Enhancement. Proc. Interspeech 2020, 2422-2426, DOI: 10.21437/Interspeech.2020-1513.


@inproceedings{Li2020,
  author={Andong Li and Chengshi Zheng and Cunhang Fan and Renhua Peng and Xiaodong Li},
  title={{A Recursive Network with Dynamic Attention for Monaural Speech Enhancement}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2422--2426},
  doi={10.21437/Interspeech.2020-1513},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1513}
}