U-Net Based Direct-Path Dominance Test for Robust Direction-of-Arrival Estimation

Hao Wang, Kai Chen, Jing Lu


It has been noted that identifying the time-frequency bins dominated by the direct-path sound of the target speaker can significantly improve the robustness of direction-of-arrival estimation. However, correctly extracting the direct-path sound is challenging, especially in adverse environments. In this paper, a U-net based direct-path dominance test method is proposed. Exploiting the efficient segmentation capability of the U-net architecture, the direct-path information can be effectively retrieved from a dedicated multi-task neural network. Moreover, training and inference of the neural network require the input of only a single microphone, circumventing the problem of array-structure dependence faced by common end-to-end deep learning based methods. Simulations demonstrate that significantly higher estimation accuracy can be achieved in highly reverberant and low signal-to-noise ratio environments.


DOI: 10.21437/Interspeech.2020-2493

Cite as: Wang, H., Chen, K., Lu, J. (2020) U-Net Based Direct-Path Dominance Test for Robust Direction-of-Arrival Estimation. Proc. Interspeech 2020, 5086-5090, DOI: 10.21437/Interspeech.2020-2493.


@inproceedings{Wang2020,
  author={Hao Wang and Kai Chen and Jing Lu},
  title={{U-Net Based Direct-Path Dominance Test for Robust Direction-of-Arrival Estimation}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={5086--5090},
  doi={10.21437/Interspeech.2020-2493},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2493}
}