Enhancing Transferability of Black-Box Adversarial Attacks via Lifelong Learning for Speech Emotion Recognition Models

Zhao Ren, Jing Han, Nicholas Cummins, Björn W. Schuller


Well-designed adversarial examples can easily fool deep speech emotion recognition models into misclassification. The transferability of adversarial attacks is a crucial evaluation indicator when generating adversarial examples to fool a new target model or multiple models. Herein, we propose a method to improve the transferability of black-box adversarial attacks using lifelong learning. First, black-box adversarial examples are generated by an atrous Convolutional Neural Network (CNN) model trained to attack a CNN target model. The trained atrous CNN attacker is then adapted to a new CNN target model via lifelong learning. We adopt this paradigm because it enables multi-task sequential learning, which requires less memory than conventional multi-task learning. Experiments on an emotional speech database verify this property: the updated atrous CNN model can attack all target models it has learnt, and attacks a new target model more effectively than an attack model trained on a single target model only.
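The attacker described above is an atrous CNN, i.e., a CNN built on dilated convolutions, which widen the receptive field without adding parameters. As a minimal illustrative sketch (not the authors' implementation), a 1-D dilated convolution can be written in plain Python; the function name and example values are our own:

```python
def dilated_conv1d(x, w, dilation=1):
    """Valid 1-D convolution of signal x with kernel w, sampling
    inputs `dilation` steps apart (atrous convolution)."""
    span = (len(w) - 1) * dilation  # input extent covered by the kernel
    return [sum(w[k] * x[i + k * dilation] for k in range(len(w)))
            for i in range(len(x) - span)]

x = [1, 2, 3, 4, 5, 6]
w = [1, 0, -1]  # simple difference kernel
print(dilated_conv1d(x, w, dilation=1))  # [-2, -2, -2, -2]
print(dilated_conv1d(x, w, dilation=2))  # [-4, -4]
```

With `dilation=2`, the same three-tap kernel spans five input samples instead of three, which is how an atrous CNN captures longer-range context in the speech signal at no extra parameter cost.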
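The abstract does not name the specific lifelong-learning algorithm used for the sequential adaptation. One common choice for multi-task sequential learning is a quadratic consolidation penalty in the style of elastic weight consolidation, which anchors parameters important to earlier tasks while only the current task's data is kept in memory. The sketch below is a hypothetical illustration under that assumption; the function name, importance weights, and constants are ours, not the paper's:

```python
def consolidation_penalty(theta, theta_old, importance, lam=1.0):
    """EWC-style regulariser (assumed, not the paper's stated method):
    penalise moving parameters that earlier attack tasks relied on,
    weighted by a per-parameter importance estimate."""
    return 0.5 * lam * sum(f * (t - t0) ** 2
                           for f, t, t0 in zip(importance, theta, theta_old))

# When adapting the attacker to a new target model, the training objective
# would combine the attack loss on the new target with this penalty:
#   loss = attack_loss_new_target + consolidation_penalty(...)
print(consolidation_penalty([1.5, 2.0], [0.5, 2.0], [1.0, 4.0], lam=2.0))  # 1.0
```

Because only the previous parameters and their importance weights are stored, rather than all earlier targets' data, this style of sequential learning uses less memory than joint multi-task training, matching the motivation stated in the abstract.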


 DOI: 10.21437/Interspeech.2020-1869

Cite as: Ren, Z., Han, J., Cummins, N., Schuller, B.W. (2020) Enhancing Transferability of Black-Box Adversarial Attacks via Lifelong Learning for Speech Emotion Recognition Models. Proc. Interspeech 2020, 496-500, DOI: 10.21437/Interspeech.2020-1869.


@inproceedings{Ren2020,
  author={Zhao Ren and Jing Han and Nicholas Cummins and Björn W. Schuller},
  title={{Enhancing Transferability of Black-Box Adversarial Attacks via Lifelong Learning for Speech Emotion Recognition Models}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={496--500},
  doi={10.21437/Interspeech.2020-1869},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1869}
}