Attention-Driven Projections for Soundscape Classification

Dhanunjaya Varma Devalraju, Muralikrishna H., Padmanabhan Rajan, Dileep Aroor Dinesh


Acoustic soundscapes are composed of background and foreground sound events. Often, either the background or the foreground provides useful cues for discriminating one soundscape from another. Parts of the background or the foreground can be suppressed using subspace projections, which can be learnt within the framework of robust principal component analysis. In this work, audio signals are represented as embeddings from a convolutional neural network, and meta-embeddings are derived using an attention mechanism. This representation enables class-specific projections for effective suppression, leading to good discrimination. Experimental evaluation demonstrates the effectiveness of the method on standard acoustic scene classification datasets.
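
To make the pipeline concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes frame-level CNN embeddings of shape (T, d) per clip, pools them into a meta-embedding with a simple attention vector, recovers a low-rank (background-like) component with robust PCA solved via the inexact augmented Lagrange multiplier method, and builds a projection that suppresses that subspace. All names here (attention_pool, robust_pca, suppression_projection) are hypothetical.

import numpy as np

def attention_pool(frames, w):
    """Attention pooling: weighted average of frame embeddings.
    frames: (T, d) frame-level CNN embeddings, w: (d,) attention vector."""
    scores = frames @ w                       # (T,) unnormalised attention scores
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                      # softmax over time
    return alpha @ frames                     # (d,) meta-embedding

def robust_pca(M, lam=None, tol=1e-7, max_iter=500):
    """Decompose M ~ L + S (low-rank + sparse) by principal component pursuit
    using the inexact augmented Lagrange multiplier scheme."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    norm_M = np.linalg.norm(M, 'fro')
    mu = 1.25 / np.linalg.norm(M, 2)
    mu_bar = mu * 1e7
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    for _ in range(max_iter):
        # low-rank update: singular value thresholding
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # sparse update: elementwise soft thresholding
        R = M - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        Y = Y + mu * (M - L - S)
        if np.linalg.norm(M - L - S, 'fro') < tol * norm_M:
            break
        mu = min(mu * 1.5, mu_bar)
    return L, S

def suppression_projection(embeddings, rank):
    """Projection that removes the top-`rank` directions of the low-rank
    component recovered by robust PCA (e.g. a background subspace).
    embeddings: (N, d), one row per clip."""
    L, _ = robust_pca(embeddings)
    _, _, Vt = np.linalg.svd(L, full_matrices=False)
    B = Vt[:rank].T                           # (d, rank) basis to suppress
    return np.eye(embeddings.shape[1]) - B @ B.T

# Usage: project a meta-embedding before classification
rng = np.random.default_rng(0)
clips = rng.normal(size=(64, 128))            # 64 clip-level meta-embeddings, d = 128
P = suppression_projection(clips, rank=4)
x = attention_pool(rng.normal(size=(10, 128)), rng.normal(size=128))
x_proj = P @ x                                # background-suppressed representation

In the paper the projections are class-specific and the attention weights are learnt jointly with the network; the sketch shows a single projection with a fixed attention vector purely to illustrate the flow.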


DOI: 10.21437/Interspeech.2020-2476

Cite as: Devalraju, D.V., H., M., Rajan, P., Dinesh, D.A. (2020) Attention-Driven Projections for Soundscape Classification. Proc. Interspeech 2020, 1206-1210, DOI: 10.21437/Interspeech.2020-2476.


@inproceedings{Devalraju2020,
  author={Dhanunjaya Varma Devalraju and Muralikrishna H. and Padmanabhan Rajan and Dileep Aroor Dinesh},
  title={{Attention-Driven Projections for Soundscape Classification}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={1206--1210},
  doi={10.21437/Interspeech.2020-2476},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2476}
}