Friendly Training: Neural Networks Can Adapt Data To Make Learning Easier

In the last decade, motivated by the success of Deep Learning, the scientific community has proposed several approaches to make the learning procedure of Neural Networks more effective. When focusing on the way in which training data are provided to the learning machine, we can distinguish between the classic random selection of stochastic gradient-based optimization and more involved techniques that devise curricula to organize the data and progressively increase the complexity of the training set. In this paper, we propose a novel training procedure, named Friendly Training, that, unlike the aforementioned approaches, alters the training examples in order to help the model better fulfill its learning criterion. The model is allowed to “simplify” those examples that are too hard to classify at a certain stage of training. The data transformation is controlled by a developmental plan that progressively reduces its impact during training, until it completely vanishes. In a sense, this is the opposite of what is commonly done to increase robustness against adversarial examples, i.e., Adversarial Training. Experiments on multiple datasets show that Friendly Training yields improvements over informed data sub-selection routines and random selection, especially in deep convolutional architectures. The results suggest that adapting the input data is a feasible way to stabilize learning and improve the generalization skills of the network.
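
To make the simplification step concrete, the following is a minimal PyTorch-style sketch of one way such a procedure could look: each mini-batch is perturbed by a few signed gradient-descent steps on the input (the mirror image of a PGD attack, which ascends the loss instead), and the allowed perturbation decays linearly to zero during training. Function names, hyperparameters, and the decay schedule are illustrative assumptions, not the paper's exact algorithm.

import torch
import torch.nn.functional as F

def friendly_perturb(model, x, y, step_size=0.02, n_steps=5, max_eps=0.1):
    # Perturb the inputs to REDUCE the classification loss (the opposite of
    # an adversarial attack), so that hard examples become easier for the
    # current model. All hyperparameters here are illustrative assumptions.
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(n_steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= step_size * grad.sign()   # gradient descent on the input
            delta.clamp_(-max_eps, max_eps)    # keep the simplification small
    return (x + delta).detach()

def train_friendly(model, loader, optimizer, epochs=20, eps0=0.1):
    for epoch in range(epochs):
        # Developmental plan (one possible schedule): the allowed input change
        # shrinks linearly and vanishes halfway through training, after which
        # the model only sees unmodified examples.
        eps = eps0 * max(0.0, 1.0 - 2.0 * epoch / epochs)
        model.train()
        for x, y in loader:
            if eps > 0:
                x = friendly_perturb(model, x, y, max_eps=eps)
            optimizer.zero_grad()
            F.cross_entropy(model(x), y).backward()
            optimizer.step()

Under this reading, friendly_perturb is essentially projected gradient descent on the input with the sign of the update flipped relative to Adversarial Training, which is why the two procedures can be viewed as opposites.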
