Improving Back-Propagation by Adding an Adversarial Gradient

The back-propagation algorithm is widely used for learning in artificial neural networks. A challenge in machine learning is to create models that generalize to new data samples not seen in the training data. Recently, a common flaw in several machine learning algorithms was discovered: small perturbations added to the input data lead to consistent misclassification of data samples. Samples that easily mislead the model are called adversarial examples. Training a "maxout" network on adversarial examples has been shown to decrease this vulnerability, but also to increase classification performance. This paper shows that adversarial training also has a regularizing effect in networks with logistic, hyperbolic tangent and rectified linear units. A simple extension to the back-propagation method is proposed that adds an adversarial gradient to the training. The extension requires an additional forward and backward pass to calculate a modified input sample, or mini-batch, which is then used as input for standard back-propagation learning. First experimental results on MNIST show that the "adversarial back-propagation" method increases the resistance to adversarial examples and boosts classification performance. The extension reduces the classification error on the permutation-invariant MNIST task from 1.60% to 0.95% in a logistic network, and from 1.40% to 0.78% in a network with rectified linear units. Results on CIFAR-10 indicate that the method has a regularizing effect similar to dropout in fully connected networks. Based on these promising results, adversarial back-propagation is proposed as a stand-alone regularizing method that should be further investigated.
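
The training step described above (an extra forward and backward pass to build a perturbed mini-batch, followed by standard back-propagation on that mini-batch) can be illustrated with a minimal sketch. The sketch below assumes a PyTorch model and a fast-gradient-sign style perturbation; the function name adversarial_backprop_step and the value of epsilon are illustrative choices, not taken from the paper.

import torch
import torch.nn.functional as F

def adversarial_backprop_step(model, optimizer, x, y, epsilon=0.08):
    """One training step on an adversarially perturbed mini-batch.

    epsilon (the perturbation size) is an illustrative value, not a
    setting prescribed by the paper.
    """
    # Extra forward/backward pass: gradient of the loss w.r.t. the input.
    x_req = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_req), y)
    grad_x, = torch.autograd.grad(loss, x_req)

    # Perturb the input along the sign of the input gradient
    # (fast-gradient-sign style adversarial example).
    x_adv = (x + epsilon * grad_x.sign()).detach()

    # Standard back-propagation step on the perturbed mini-batch.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()

In an ordinary MNIST training loop, this step would simply replace the usual optimizer step, e.g. for a fully connected ReLU network trained with SGD on 784-dimensional inputs.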
