Conv-TasSAN: Separative Adversarial Network Based on Conv-TasNet

Chengyun Deng, Yi Zhang, Shiqian Ma, Yongtao Sha, Hui Song, Xiangang Li

Conv-TasNet has shown competitive performance on single-channel speech source separation. In this paper, we investigate how to further improve separation performance by optimizing the training mechanism while keeping the network structure unchanged. Motivated by the successful application of generative adversarial networks (GANs) to speech enhancement tasks, we propose a novel Separative Adversarial Network called Conv-TasSAN, in which the separator is realized with the Conv-TasNet architecture. A discriminator is introduced to optimize the separator with respect to a specific objective speech metric. This enables the separator network to capture the distribution of the speech sources more accurately and also prevents over-smoothing. Experiments on the WSJ0-2mix dataset confirm the superior performance of the proposed method over Conv-TasNet in terms of SI-SNR and PESQ improvement.
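For readers unfamiliar with the SI-SNR metric mentioned above, the following is a minimal sketch of the commonly used definition of scale-invariant signal-to-noise ratio and of its improvement over the unprocessed mixture. It is illustrative only and is not taken from the paper: the function names `si_snr` and `si_snr_improvement` and the `eps` stabilizer are assumptions made for this example.

```python
import math


def _zero_mean(x):
    """Return a copy of x with its mean removed."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def _dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between an estimate and a reference signal.

    Hypothetical helper: both signals are made zero-mean, the estimate is
    projected onto the target, and the energy ratio of the projection to
    the residual is reported in decibels.
    """
    est = _zero_mean(estimate)
    ref = _zero_mean(target)
    # Projection of the estimate onto the reference signal.
    scale = _dot(est, ref) / (_dot(ref, ref) + eps)
    s_target = [scale * v for v in ref]
    # Everything in the estimate not explained by the reference.
    e_noise = [e - s for e, s in zip(est, s_target)]
    ratio = (_dot(s_target, s_target) + eps) / (_dot(e_noise, e_noise) + eps)
    return 10.0 * math.log10(ratio)


def si_snr_improvement(estimate, mixture, target):
    """SI-SNR gain of the separated estimate over the raw mixture (dB)."""
    return si_snr(estimate, target) - si_snr(mixture, target)
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is why the metric is called scale-invariant.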

DOI: 10.21437/Interspeech.2020-2371

Cite as: Deng, C., Zhang, Y., Ma, S., Sha, Y., Song, H., Li, X. (2020) Conv-TasSAN: Separative Adversarial Network Based on Conv-TasNet. Proc. Interspeech 2020, 2647-2651, DOI: 10.21437/Interspeech.2020-2371.

  author={Chengyun Deng and Yi Zhang and Shiqian Ma and Yongtao Sha and Hui Song and Xiangang Li},
  title={{Conv-TasSAN: Separative Adversarial Network Based on Conv-TasNet}},
  booktitle={Proc. Interspeech 2020},