Surgical Mask Detection with Convolutional Neural Networks and Data Augmentations on Spectrograms

Steffen Illium, Robert Müller, Andreas Sedlmeier, Claudia Linnhoff-Popien


In many fields of research, labeled datasets are hard to acquire. Data augmentation promises to compensate for the lack of training data when engineering neural networks for classification tasks. The idea is to reduce model overfitting to the feature distribution of a small, under-descriptive training dataset. We evaluate such data augmentation techniques to gain insight into the performance boost they provide for several convolutional neural networks on mel-spectrogram representations of audio data. We show the impact of data augmentation on the binary classification task of surgical mask detection in samples of human voice (ComParE Challenge 2020). We also consider four different architectures to account for augmentation robustness. Results show that most of the baselines given by ComParE are outperformed.
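The abstract does not name the individual augmentations, so the following is only a minimal illustrative sketch of what augmenting a mel-spectrogram can look like. The function name `augment_spectrogram`, its parameters, and the three transformations chosen (time shift, additive noise, mel-band masking) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def augment_spectrogram(spec, rng, noise_std=0.05, max_shift=8, max_mask=4):
    """Return a randomly augmented copy of a (n_mels, n_frames) spectrogram.

    Hypothetical helper: the transformations and default values below are
    illustrative assumptions, not the augmentations used in the paper.
    """
    out = spec.copy()

    # Circular shift along the time axis by a random number of frames.
    shift = int(rng.integers(-max_shift, max_shift + 1))
    out = np.roll(out, shift, axis=1)

    # Additive Gaussian noise on every time-frequency bin.
    out = out + rng.normal(0.0, noise_std, size=out.shape)

    # Mask a random band of adjacent mel bins with the current minimum value.
    width = int(rng.integers(0, max_mask + 1))
    start = int(rng.integers(0, out.shape[0] - width + 1))
    out[start:start + width, :] = out.min()

    return out


rng = np.random.default_rng(0)
spec = rng.random((64, 100))  # random stand-in for a real mel-spectrogram
aug = augment_spectrogram(spec, rng)
```

In a training loop, such a function would typically be applied on the fly to each training sample only, leaving the validation and test spectrograms untouched.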


DOI: 10.21437/Interspeech.2020-1692

Cite as: Illium, S., Müller, R., Sedlmeier, A., Linnhoff-Popien, C. (2020) Surgical Mask Detection with Convolutional Neural Networks and Data Augmentations on Spectrograms. Proc. Interspeech 2020, 2052-2056, DOI: 10.21437/Interspeech.2020-1692.


@inproceedings{Illium2020,
  author={Steffen Illium and Robert Müller and Andreas Sedlmeier and Claudia Linnhoff-Popien},
  title={{Surgical Mask Detection with Convolutional Neural Networks and Data Augmentations on Spectrograms}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2052--2056},
  doi={10.21437/Interspeech.2020-1692},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1692}
}