Adversarial Latent Representation Learning for Speech Enhancement

Yuanhang Qiu, Ruili Wang


This paper proposes a novel adversarial latent representation learning (ALRL) method for speech enhancement. Building on adversarial feature learning, ALRL employs an extra encoder to learn an inverse mapping from the generated data distribution to the latent space. The encoder builds an inner connection with the generator and provides relevant latent information for adversarial feature modelling. A new loss function is proposed so that the encoder mapping is learned simultaneously. In addition, multi-head self-attention is applied to the encoder to learn long-range dependencies and more effective adversarial representations. Experimental results demonstrate that ALRL outperforms current GAN-based speech enhancement methods.
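As a rough illustration of the inverse mapping described above, the sketch below passes a noisy frame and a latent code through a toy generator, maps the output back to latent space with a toy encoder, and measures how far the recovered code is from the original. The abstract does not state the form of the proposed loss, so the L1 latent reconstruction term used here is an assumption, and the names `generate`, `encode`, and `latent_reconstruction_loss` are hypothetical stand-ins rather than the authors' networks.

```python
import random


def generate(noisy, z, gain=0.5):
    """Toy stand-in for the generator G (assumed form): mixes the noisy frame with the latent code."""
    return [gain * x + zi for x, zi in zip(noisy, z)]


def encode(enhanced, scale=1.0):
    """Toy stand-in for the encoder E (assumed form): maps generated samples back to latent space."""
    return [scale * y for y in enhanced]


def latent_reconstruction_loss(z, z_hat):
    """Mean absolute error between the sampled latent code and the encoder's estimate.

    Assumption: an L1 term is used purely for illustration; the paper's loss may differ.
    """
    return sum(abs(a - b) for a, b in zip(z, z_hat)) / len(z)


random.seed(0)
noisy = [random.uniform(-1.0, 1.0) for _ in range(8)]  # hypothetical noisy speech frame
z = [random.gauss(0.0, 1.0) for _ in range(8)]         # latent code sampled from a Gaussian prior

enhanced = generate(noisy, z)   # G(noisy, z)
z_hat = encode(enhanced)        # E(G(noisy, z)), the inverse mapping
loss = latent_reconstruction_loss(z, z_hat)
print(f"latent reconstruction loss: {loss:.4f}")
```

In a real system both mappings would be neural networks trained jointly with a discriminator; a term of this kind would only be one component added to the adversarial objective.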


DOI: 10.21437/Interspeech.2020-1593

Cite as: Qiu, Y., Wang, R. (2020) Adversarial Latent Representation Learning for Speech Enhancement. Proc. Interspeech 2020, 2662-2666, DOI: 10.21437/Interspeech.2020-1593.


@inproceedings{Qiu2020,
  author={Yuanhang Qiu and Ruili Wang},
  title={{Adversarial Latent Representation Learning for Speech Enhancement}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2662--2666},
  doi={10.21437/Interspeech.2020-1593},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1593}
}