Being Friends Instead of Adversaries: Deep Networks Learn from Data Simplified by Other Networks

Among the variety of approaches aimed at making the learning procedure of neural networks more effective, the scientific community has developed strategies that order the examples according to their estimated complexity, distil knowledge from larger networks, or exploit the principles behind adversarial machine learning. A different idea, named Friendly Training, has recently been proposed: the input data are altered by adding an automatically estimated perturbation, with the goal of facilitating the learning process of a neural classifier. The transformation progressively fades out as training proceeds, until it completely vanishes. In this work we revisit and extend this idea, introducing a radically different and novel approach inspired by the effectiveness of neural generators in the context of Adversarial Machine Learning. We propose an auxiliary multi-layer network that is responsible for altering the input data to make them easier for the classifier to handle at the current stage of the training procedure. The auxiliary network is trained jointly with the neural classifier, thus intrinsically increasing the “depth” of the classifier, and it is expected to spot general regularities in the data-alteration process. The effect of the auxiliary network is progressively reduced up to the end of training, when it is fully dropped and the classifier alone is deployed for applications. We refer to this approach as Neural Friendly Training. An extensive experimental evaluation involving several datasets and different neural architectures shows that Neural Friendly Training outperforms the originally proposed Friendly Training technique, improving the generalization of the classifier, especially in the case of noisy data.
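The abstract describes the core mechanism: an auxiliary network perturbs the inputs, is trained jointly with the classifier, and its effect fades to zero by the end of training. The snippet below is a minimal sketch of that idea under assumed details; the network sizes, the additive perturbation, the linear fade-out schedule, and all names (`Classifier`, `AuxiliaryPerturber`, `train_friendly`) are illustrative choices, not the paper's exact formulation.

```python
# Minimal sketch of the Neural Friendly Training idea (assumed details, see note above).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Classifier(nn.Module):
    def __init__(self, in_dim=784, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, num_classes))

    def forward(self, x):
        return self.net(x)

class AuxiliaryPerturber(nn.Module):
    """Auxiliary multi-layer network producing a data-dependent perturbation."""
    def __init__(self, in_dim=784):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, in_dim))

    def forward(self, x):
        return self.net(x)

def train_friendly(classifier, perturber, loader, max_epochs=50, lr=1e-3):
    # Both networks are trained jointly with a single optimizer.
    opt = torch.optim.Adam(list(classifier.parameters()) +
                           list(perturber.parameters()), lr=lr)
    for epoch in range(max_epochs):
        # Fade-out coefficient: 1 at the start of training, 0 at the end,
        # so the classifier eventually sees the unaltered data.
        alpha = max(0.0, 1.0 - epoch / (max_epochs - 1))
        for x, y in loader:
            x = x.view(x.size(0), -1)                # flatten inputs (illustrative)
            x_friendly = x + alpha * perturber(x)    # simplified ("friendly") input
            loss = F.cross_entropy(classifier(x_friendly), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return classifier  # at deployment, only the classifier is kept
```

At test time the perturber is dropped entirely and plain inputs are fed to the classifier, matching the fade-out behaviour described in the abstract.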
