A Study on the Transferability of Adversarial Attacks in Sound Event Classification

An adversarial attack is an algorithm that intelligently perturbs the input of a machine learning model in order to change the model's output. An important property of adversarial attacks is transferability: a perturbation generated to fool one model can be applied to the input of a different model and fool it as well. Our work studies the transferability of adversarial attacks in sound event classification. We demonstrate that the transferability properties differ from those observed in computer vision. We show that dataset normalization techniques such as z-score normalization do not affect the transferability of adversarial attacks, and that techniques such as knowledge distillation do not increase the transferability of attacks.
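
To make the notion of transferability concrete, below is a minimal sketch of a transferability test in PyTorch. The two models, the log-mel input shape, the choice of FGSM (the single-step gradient-sign attack of Goodfellow et al.), and the epsilon value are all illustrative assumptions, not the setup of this paper.

```python
# Minimal transferability sketch, assuming two trained PyTorch classifiers
# `source_model` and `target_model` that both accept log-mel spectrogram
# tensors of shape (batch, 1, n_mels, n_frames). All names and the epsilon
# value are hypothetical placeholders for illustration.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon):
    """Fast Gradient Sign Method: one gradient-sign step on the input."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the source model's loss.
    return (x + epsilon * x.grad.sign()).detach()

def transfer_rate(source_model, target_model, x, y, epsilon=0.01):
    """Fraction of inputs, perturbed using only the source model's
    gradients, that the target model then misclassifies."""
    x_adv = fgsm_perturb(source_model, x, y, epsilon)
    with torch.no_grad():
        pred_target = target_model(x_adv).argmax(dim=1)
    return (pred_target != y).float().mean().item()
```

Here `transfer_rate` reports the target model's error rate on perturbations crafted entirely from the source model; a rate well above the target model's clean error rate indicates that the attack transfers.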
