Are you Wearing a Mask? Improving Mask Detection from Speech Using Augmentation by Cycle-Consistent GANs

Nicolae-Cătălin Ristea, Radu Tudor Ionescu


Detecting from speech whether a person is wearing a face mask is useful for modelling speech in forensic investigations, in communication between surgeons, and among people protecting themselves against infectious diseases such as COVID-19. In this paper, we propose a novel data augmentation approach for mask detection from speech. Our approach is based on (i) training Generative Adversarial Networks (GANs) with cycle-consistency loss to translate unpaired utterances between two classes (with mask and without mask), and (ii) generating new training utterances with the cycle-consistent GANs, assigning the opposite label to each translated utterance. Original and translated utterances are converted into spectrograms, which are provided as input to a set of ResNet neural networks of various depths. The networks are combined into an ensemble through a Support Vector Machine (SVM) classifier. With this system, we participated in the Mask Sub-Challenge (MSC) of the INTERSPEECH 2020 Computational Paralinguistics Challenge, surpassing the baseline proposed by the organizers by 2.8%. Our data augmentation technique provided a performance boost of 0.9% on the private test set. Furthermore, we show that our data augmentation approach yields better results than other baseline and state-of-the-art augmentation methods.
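The label-flipping step in (ii) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `augment_with_translation`, the label encoding (1 = with mask, 0 = without mask), and the two translator callables `to_mask` and `to_no_mask` are assumptions made for illustration, standing in for the two trained cycle-consistent GAN generators.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

# Generic utterance type (e.g. a waveform or spectrogram); hypothetical placeholder.
U = TypeVar("U")

WITH_MASK = 1      # assumed label encoding, not taken from the paper
WITHOUT_MASK = 0


def augment_with_translation(
    examples: Sequence[Tuple[U, int]],
    to_mask: Callable[[U], U],
    to_no_mask: Callable[[U], U],
) -> List[Tuple[U, int]]:
    """Return the original examples plus one translated copy of each.

    `to_mask` and `to_no_mask` stand in for the two GAN generators.
    Each translated utterance receives the opposite label of its source.
    """
    augmented = list(examples)  # keep all original utterances
    for utterance, label in examples:
        if label == WITHOUT_MASK:
            augmented.append((to_mask(utterance), WITH_MASK))
        else:
            augmented.append((to_no_mask(utterance), WITHOUT_MASK))
    return augmented


# Toy usage with string stand-ins for utterances and trivial "generators".
data = [("utt_a", WITHOUT_MASK), ("utt_b", WITH_MASK)]
result = augment_with_translation(
    data,
    to_mask=lambda u: u + "+mask",
    to_no_mask=lambda u: u + "-mask",
)
```

In this sketch the augmented set is twice the size of the original; whether every utterance is translated, or only a subset, is a design choice that the abstract does not specify.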


DOI: 10.21437/Interspeech.2020-1329

Cite as: Ristea, N., Ionescu, R.T. (2020) Are you Wearing a Mask? Improving Mask Detection from Speech Using Augmentation by Cycle-Consistent GANs. Proc. Interspeech 2020, 2102-2106, DOI: 10.21437/Interspeech.2020-1329.


@inproceedings{Ristea2020,
  author={Nicolae-Cătălin Ristea and Radu Tudor Ionescu},
  title={{Are you Wearing a Mask? Improving Mask Detection from Speech Using Augmentation by Cycle-Consistent GANs}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2102--2106},
  doi={10.21437/Interspeech.2020-1329},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1329}
}