Improving Back-Propagation by Adding an Adversarial Gradient

The back-propagation algorithm is widely used for learning in artificial neural networks. A challenge in machine learning is to create models that generalize to new data samples not seen in the training data. Recently, a common flaw in several machine learning algorithms was discovered: small perturbations added to the input data lead to consistent misclassification of data samples. Samples that easily mislead the model are called adversarial examples. Training a "maxout" network on adversarial examples has been shown to decrease this vulnerability, but also to increase classification performance. This paper shows that adversarial training also has a regularizing effect in networks with logistic, hyperbolic tangent and rectified linear units. A simple extension to the back-propagation method is proposed that adds an adversarial gradient to the training. The extension requires an additional forward and backward pass to calculate a modified input sample, or mini-batch, which is then used as input for standard back-propagation learning. First experimental results on MNIST show that the "adversarial back-propagation" method increases the resistance to adversarial examples and boosts classification performance. The extension reduces the classification error on the permutation-invariant MNIST task from 1.60% to 0.95% in a logistic network, and from 1.40% to 0.78% in a network with rectified linear units. Results on CIFAR-10 indicate that the method has a regularizing effect similar to dropout in fully connected networks. Based on these promising results, adversarial back-propagation is proposed as a stand-alone regularizing method that should be further investigated.
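
The training step described above (an extra forward and backward pass to build a perturbed mini-batch, followed by standard back-propagation on that mini-batch) can be illustrated with a minimal sketch. The sketch below assumes a PyTorch model and a fast-gradient-sign style perturbation; the function name adversarial_backprop_step and the value of epsilon are illustrative choices, not taken from the paper.

import torch
import torch.nn.functional as F

def adversarial_backprop_step(model, optimizer, x, y, epsilon=0.08):
    """One training step on an adversarially perturbed mini-batch.

    epsilon (the perturbation size) is an illustrative value, not a
    setting prescribed by the paper.
    """
    # Extra forward/backward pass: gradient of the loss w.r.t. the input.
    x_req = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_req), y)
    grad_x, = torch.autograd.grad(loss, x_req)

    # Perturb the input along the sign of the input gradient
    # (fast-gradient-sign style adversarial example).
    x_adv = (x + epsilon * grad_x.sign()).detach()

    # Standard back-propagation step on the perturbed mini-batch.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()

In an ordinary MNIST training loop, this step would simply replace the usual optimizer step, e.g. for a fully connected ReLU network trained with SGD on 784-dimensional inputs.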
