Towards Deep Neural Network Architectures Robust to Adversarial Examples

Recent work has shown deep neural networks (DNNs) to be highly susceptible to well-designed, small perturbations at the input layer, so-called adversarial examples. Taking images as an example, such distortions are often imperceptible, yet they can result in 100% misclassification for a state-of-the-art DNN. We study the structure of adversarial examples and explore network topology, pre-processing, and training strategies to improve the robustness of DNNs. We perform various experiments to assess how easily adversarial examples can be removed, by corrupting inputs with additional noise and by pre-processing with denoising autoencoders (DAEs). We find that DAEs can remove a substantial amount of the adversarial noise. However, when the DAE is stacked with the original DNN, the resulting network can again be attacked by new adversarial examples with even smaller distortion. As a solution, we propose the Deep Contractive Network, a model with a new end-to-end training procedure that includes a smoothness penalty inspired by the contractive autoencoder (CAE). This increases the network's robustness to adversarial examples without a significant performance penalty.
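
To make the kind of smoothness penalty referred to above concrete, the sketch below adds a contractive-style regularizer (the squared Frobenius norm of the network's input-output Jacobian) to a standard classification loss. This is a minimal illustrative sketch in JAX, not the authors' implementation; the two-layer architecture, the penalty weight `lam`, the toy data, and the use of the full end-to-end Jacobian (rather than any layer-wise variant) are assumptions made here for clarity.

```python
import jax
import jax.numpy as jnp

def mlp(params, x):
    """Two-layer MLP returning class logits for a single example x."""
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    return h @ w2 + b2

def contractive_loss(params, x_batch, y_batch, lam=0.1):
    """Cross-entropy plus a contractive-style smoothness penalty.

    The penalty is the squared Frobenius norm of d(logits)/d(input),
    averaged over the batch; lam is an assumed hyperparameter.
    """
    logits = jax.vmap(lambda xi: mlp(params, xi))(x_batch)
    ce = -jnp.mean(jnp.sum(y_batch * jax.nn.log_softmax(logits), axis=-1))
    # Per-example Jacobian of the logits w.r.t. the input,
    # shape (batch, num_classes, input_dim).
    jac = jax.vmap(lambda xi: jax.jacobian(lambda z: mlp(params, z))(xi))(x_batch)
    penalty = jnp.mean(jnp.sum(jac ** 2, axis=(-2, -1)))
    return ce + lam * penalty

# Example usage with random toy parameters and data (MNIST-like shapes).
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
params = (
    0.1 * jax.random.normal(k1, (784, 128)), jnp.zeros(128),
    0.1 * jax.random.normal(k2, (128, 10)), jnp.zeros(10),
)
x = jax.random.normal(k3, (32, 784))
y = jax.nn.one_hot(jnp.zeros(32, dtype=jnp.int32), 10)
value, grads = jax.value_and_grad(contractive_loss)(params, x, y)
```

The Jacobian term encourages the learned mapping to be locally flat around training points, which is the property such a penalty exploits to increase the input distortion needed to produce an adversarial example.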
