Distributional Smoothing by Virtual Adversarial Examples

We propose a novel regularization technique for supervised and semi-supervised training of large models such as deep neural networks. By adding to the objective function a term for the local smoothness of the predictive distribution around each training data point, we were able not only to extend the work of Goodfellow et al. (2015) to the semi-supervised setting, but also to surpass the current state-of-the-art supervised and semi-supervised methods on the permutation-invariant MNIST classification task.
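The abstract gives no formulas, so the following is only a minimal sketch of what a "local smoothness" penalty on the predictive distribution might look like. It makes several assumptions that do not come from the text above: smoothness is measured as the KL divergence between the prediction at x and the prediction at x + r; the worst-case perturbation r of norm eps is approximated by a few power-iteration steps with finite-difference gradients; and the model is a toy linear softmax classifier. All function and parameter names (`predict`, `virtual_adversarial_perturbation`, `smoothness_penalty`, `xi`, `h`) are hypothetical. Note that no label is used anywhere, which is why a penalty of this kind can also be evaluated on unlabeled points.

```python
import math
import random


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, x):
    """Toy stand-in model (assumption): p(y | x) = softmax(W x)."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in W])


def kl(p, q):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v] if n > 0 else list(v)


def virtual_adversarial_perturbation(W, x, eps, xi=1e-2, h=1e-4, n_iter=5, seed=0):
    """Approximate the norm-eps perturbation that changes the prediction most.

    Starts from a random unit direction d and repeatedly replaces it with the
    normalized gradient of KL(p(x) || p(x + r)) at r = xi * d, estimated by
    central finite differences (assumed approximation scheme).
    """
    rng = random.Random(seed)
    p = predict(W, x)
    d = normalize([rng.gauss(0.0, 1.0) for _ in x])
    for _ in range(n_iter):
        r = [xi * c for c in d]
        grad = []
        for i in range(len(x)):
            r_plus, r_minus = list(r), list(r)
            r_plus[i] += h
            r_minus[i] -= h
            f_plus = kl(p, predict(W, [a + b for a, b in zip(x, r_plus)]))
            f_minus = kl(p, predict(W, [a + b for a, b in zip(x, r_minus)]))
            grad.append((f_plus - f_minus) / (2 * h))
        d = normalize(grad)
    return [eps * c for c in d]


def smoothness_penalty(W, x, eps):
    """KL between predictions at x and at the perturbed point; no label needed."""
    r = virtual_adversarial_perturbation(W, x, eps)
    return kl(predict(W, x), predict(W, [a + b for a, b in zip(x, r)]))


if __name__ == "__main__":
    W = [[2.0, -1.0], [-1.0, 0.5], [0.3, 0.8]]  # 3 classes, 2 input features
    x = [0.5, -0.2]
    print(smoothness_penalty(W, x, eps=0.1))
```

In training, a term like this would be averaged over data points, weighted by a hyperparameter, and added to the usual supervised loss; the weighting and the choice of eps are left open here because the abstract does not state them.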

[1] G. Golub, et al. Eigenvalue computation in the 20th century, 2000.

[2] Jason Weston, et al. Deep learning via semi-supervised embedding, 2008, ICML '08.

[3] Yann LeCun, et al. What is the best multi-stage architecture for object recognition?, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[4] Pascal Vincent, et al. Contractive Auto-Encoders: Explicit Invariance During Feature Extraction, 2011, ICML.

[5] Pascal Vincent, et al. The Manifold Tangent Classifier, 2011, NIPS.

[6] Nitish Srivastava, et al. Improving neural networks by preventing co-adaptation of feature detectors, 2012, ArXiv.

[7] Razvan Pascanu, et al. Theano: new features and speed improvements, 2012, ArXiv.

[8] Dong-Hyun Lee, et al. Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks, 2013.

[9] Yoshua Bengio, et al. Maxout Networks, 2013, ICML.

[10] Philip Bachman, et al. Learning with Pseudo-Ensembles, 2014, NIPS.

[11] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.

[12] Tapani Raiko, et al. Lateral Connections in Denoising Autoencoders Support Supervised Learning, 2015, ArXiv.

[13] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.

[14] Luca Rigazio, et al. Towards Deep Neural Network Architectures Robust to Adversarial Examples, 2014, ICLR.

[15] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[16] Jonathon Shlens, et al. Explaining and Harnessing Adversarial Examples, 2014, ICLR.