Partial AUC Optimisation Using Recurrent Neural Networks for Music Detection with Limited Training Data

Pablo Gimeno, Victoria Mingote, Alfonso Ortega, Antonio Miguel, Eduardo Lleida


State-of-the-art music detection systems, whose aim is to distinguish whether or not music is present in an audio signal, rely mainly on deep learning approaches. However, this kind of solution is strongly dependent on the amount of data it was trained on. In this paper, we introduce the area under the ROC curve (AUC) and partial AUC (pAUC) optimisation techniques, recently developed for neural networks, into the music detection task, seeking to overcome the issues derived from data limitation. Using recurrent neural networks as the main element in the system and with a limited training set of around 20 hours of audio, we explore different approximations to threshold-independent training objectives. Furthermore, we propose a novel training objective based on the decomposition of the area under the ROC curve as the sum of two partial areas under the ROC curve. Experimental results show that partial AUC optimisation can improve the performance of music detection systems significantly compared to traditional training criteria such as cross entropy.
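
The abstract does not give the exact form of the training objectives, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes the common pairwise view of the AUC (the fraction of music/non-music score pairs ranked correctly), replaces the step function with a sigmoid to obtain a smooth surrogate, and restricts the negatives to a false positive rate (FPR) interval to obtain an unnormalised partial AUC. The function name `soft_partial_auc`, the slope parameter `delta`, and the toy scores are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def soft_partial_auc(pos, neg, fpr_lo=0.0, fpr_hi=1.0, delta=10.0):
    """Sigmoid-smoothed, unnormalised pAUC over the FPR interval [fpr_lo, fpr_hi].

    pos: scores of positive (music) examples.
    neg: scores of negative (non-music) examples.
    delta: sigmoid slope (assumed hyperparameter); larger values
           approach the hard pairwise count.
    """
    pos = np.asarray(pos, dtype=float)
    # Sort negatives from highest to lowest score: the top-ranked negatives
    # are the ones that produce false positives at the lowest FPR values.
    neg = np.sort(np.asarray(neg, dtype=float))[::-1]
    n_neg = len(neg)
    lo = int(round(fpr_lo * n_neg))
    hi = int(round(fpr_hi * n_neg))
    selected = neg[lo:hi]
    # Pairwise score differences between every positive and selected negative.
    diffs = pos[:, None] - selected[None, :]
    # Dividing by the full number of pairs keeps the partial areas additive.
    return sigmoid(delta * diffs).sum() / (len(pos) * n_neg)


if __name__ == "__main__":
    pos = [0.9, 0.8, 0.4]       # toy music scores (made up)
    neg = [0.7, 0.3, 0.2, 0.1]  # toy non-music scores (made up)
    full = soft_partial_auc(pos, neg, 0.0, 1.0)
    low = soft_partial_auc(pos, neg, 0.0, 0.5)
    high = soft_partial_auc(pos, neg, 0.5, 1.0)
    print(full, low + high)
```

Because the two FPR intervals split the sorted negatives into disjoint groups, the two partial areas add up to the full surrogate AUC, which mirrors the decomposition mentioned in the abstract. In training, one would typically minimise one minus such a quantity (or a weighted combination of the two partial areas) in an automatic differentiation framework; that weighting is an assumption here, not something the abstract specifies.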


DOI: 10.21437/Interspeech.2020-1108

Cite as: Gimeno, P., Mingote, V., Ortega, A., Miguel, A., Lleida, E. (2020) Partial AUC Optimisation Using Recurrent Neural Networks for Music Detection with Limited Training Data. Proc. Interspeech 2020, 3067-3071, DOI: 10.21437/Interspeech.2020-1108.


@inproceedings{Gimeno2020,
  author={Pablo Gimeno and Victoria Mingote and Alfonso Ortega and Antonio Miguel and Eduardo Lleida},
  title={{Partial AUC Optimisation Using Recurrent Neural Networks for Music Detection with Limited Training Data}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={3067--3071},
  doi={10.21437/Interspeech.2020-1108},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1108}
}