Computer Audition for Continuous Rainforest Occupancy Monitoring: The Case of Bornean Gibbons’ Call Detection

Panagiotis Tzirakis, Alexander Shiarella, Robert Ewers, Björn W. Schuller


Auditory data is used by ecologists for a variety of purposes, including identifying species ranges, estimating population sizes, and studying behaviour. Autonomous recording units (ARUs) enable auditory data collection over a wider area and can provide improved consistency over traditional sampling methods. The result is an abundance of audio data, far more than can be analysed by scientists with the appropriate taxonomic skills. In this paper, we address the divide between academic machine learning research on animal vocalisation classifiers and their application to conservation efforts. As a unique case study, we build a Bornean gibbon call detection system by first manually annotating existing data and then comparing audio analysis toolkits, including end-to-end and bag-of-audio-words modelling. Finally, we propose a deep architecture that outperforms the other approaches with respect to unweighted average recall. The code is available at: https://github.com/glam-imperial/Bornean-Gibbons-Call-Detection
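For readers unfamiliar with the evaluation metric named above, unweighted average recall (UAR) is the mean of the per-class recalls, so a rare class such as "gibbon call" counts as much as the abundant background class. The following is a minimal sketch of that definition only; the function name, the label encoding (1 = call, 0 = background), and the toy labels are illustrative assumptions and are not taken from the paper or its repository.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls, weighting every class equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    total = defaultdict(int)    # number of true examples per class
    correct = defaultdict(int)  # correctly predicted examples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall is computed per class, then averaged without class-size weights.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical, imbalanced toy labels: 1 = gibbon call, 0 = background.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
uar = unweighted_average_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, UAR={uar:.2f}")
```

On imbalanced data like this toy example, plain accuracy is dominated by the background class, whereas UAR drops when the rare class is missed, which is one reason it is a common choice for detection tasks with few positive segments.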


DOI: 10.21437/Interspeech.2020-2655

Cite as: Tzirakis, P., Shiarella, A., Ewers, R., Schuller, B.W. (2020) Computer Audition for Continuous Rainforest Occupancy Monitoring: The Case of Bornean Gibbons’ Call Detection. Proc. Interspeech 2020, 1211-1215, DOI: 10.21437/Interspeech.2020-2655.


@inproceedings{Tzirakis2020,
  author={Panagiotis Tzirakis and Alexander Shiarella and Robert Ewers and Björn W. Schuller},
  title={{Computer Audition for Continuous Rainforest Occupancy Monitoring: The Case of Bornean Gibbons’ Call Detection}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={1211--1215},
  doi={10.21437/Interspeech.2020-2655},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2655}
}