Adversarial examples in the physical world

Most existing machine learning classifiers are highly vulnerable to adversarial examples. An adversarial example is a sample of input data that has been modified very slightly in a way intended to cause a machine learning classifier to misclassify it. In many cases, these modifications can be so subtle that a human observer does not even notice them, yet the classifier still makes a mistake. Adversarial examples pose security concerns because they could be used to attack machine learning systems, even if the adversary has no access to the underlying model. Up to now, all previous work has assumed a threat model in which the adversary can feed data directly into the machine learning classifier. This is not always the case for systems operating in the physical world, for example those using signals from cameras and other sensors as input. This paper shows that even in such physical-world scenarios, machine learning systems are vulnerable to adversarial examples. We demonstrate this by feeding adversarial images obtained from a cell-phone camera to an ImageNet Inception classifier and measuring the classification accuracy of the system. We find that a large fraction of adversarial examples are classified incorrectly even when perceived through the camera.
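As a rough illustration of the digital half of this pipeline, the sketch below perturbs an image with the fast gradient sign method and compares the clean and adversarial predictions of a pretrained Inception v3 classifier. The use of torchvision's Inception v3 weights, the epsilon value, the label, and the file name "photo.jpg" are illustrative assumptions rather than the paper's exact setup, and the physical print-and-photograph step is omitted.

```python
# Minimal sketch: craft an adversarial image for an ImageNet Inception
# classifier with FGSM and check whether the prediction changes.
# Model choice, epsilon, label, and input file are assumptions for illustration.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

# Pretrained Inception v3 in evaluation mode.
model = models.inception_v3(weights=models.Inception_V3_Weights.IMAGENET1K_V1)
model.eval()

# Inception v3 expects 299x299 inputs; keep pixels in [0, 1] and
# apply ImageNet normalization inside predict().
to_tensor = transforms.Compose([
    transforms.Resize(342),
    transforms.CenterCrop(299),
    transforms.ToTensor(),
])
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

def predict(x):
    """Classify a batch of images with pixel values in [0, 1]."""
    return model(normalize(x))

def fgsm(x, label, eps=0.02):
    """Fast gradient sign method: x_adv = x + eps * sign(grad_x loss)."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(predict(x), torch.tensor([label]))
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

# "photo.jpg" and label 281 ("tabby cat") are placeholders.
clean = to_tensor(Image.open("photo.jpg").convert("RGB")).unsqueeze(0)
adv = fgsm(clean, label=281)

with torch.no_grad():
    print("clean prediction:", predict(clean).argmax(1).item())
    print("adversarial prediction:", predict(adv).argmax(1).item())
```

In the physical-world experiment described above, the adversarial image would additionally be printed, photographed with a cell-phone camera, and only then fed back to the classifier.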
