Learning Data Augmentation with Online Bilevel Optimization for Image Classification

Data augmentation is a key practice in machine learning for improving generalization performance. However, finding the best data augmentation hyperparameters requires domain knowledge or a computationally demanding search. We address this issue by proposing an efficient approach that automatically trains a network to learn an effective distribution of transformations that improves generalization. Using bilevel optimization, we directly optimize the data augmentation parameters against a validation set. This framework serves as a general solution for learning the optimal data augmentation jointly with an end-task model such as a classifier. Our results show that this joint training yields image classification accuracy comparable to or better than carefully hand-crafted data augmentation, yet without an expensive external search over the augmentation hyperparameters.
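As a minimal sketch of the bilevel view described above (the symbols are illustrative: \phi denotes the augmentation parameters and w the classifier weights), the augmentation is chosen to minimize the validation loss of the classifier that results from training on augmented data:

    \min_{\phi} \; \mathcal{L}_{\mathrm{val}}\big(w^{*}(\phi)\big)
    \quad \text{s.t.} \quad
    w^{*}(\phi) \in \arg\min_{w} \; \mathcal{L}_{\mathrm{train}}(w, \phi)

In an online scheme, the inner problem is not solved to completion; updates to w and \phi instead alternate within a single training run. The sketch below illustrates one such outer step using a one-step unrolled hypergradient. It assumes PyTorch (>= 2.0 for torch.func), a differentiable augmentation module aug (e.g., built from Kornia transforms) whose parameters play the role of \phi, and a classifier model; the function and variable names are hypothetical, not the paper's implementation.

    import torch
    import torch.nn.functional as F
    from torch.func import functional_call

    def unrolled_aug_step(model, aug, opt_phi, lr_inner, train_batch, val_batch):
        """One approximate hypergradient step on the augmentation parameters phi."""
        x_tr, y_tr = train_batch
        x_va, y_va = val_batch

        # Inner loss on augmented training data; create_graph=True keeps the
        # virtual weight update below differentiable w.r.t. phi.
        params = dict(model.named_parameters())
        inner_loss = F.cross_entropy(model(aug(x_tr)), y_tr)
        grads = torch.autograd.grad(inner_loss, list(params.values()),
                                    create_graph=True)

        # Virtual one-step SGD update of the classifier: w' = w - lr * g(w, phi).
        updated = {n: p - lr_inner * g
                   for (n, p), g in zip(params.items(), grads)}

        # Outer loss: validation performance of the virtually updated classifier.
        # Backpropagating through w'(phi) yields the approximate hypergradient.
        val_loss = F.cross_entropy(functional_call(model, updated, (x_va,)), y_va)
        opt_phi.zero_grad()
        val_loss.backward()
        opt_phi.step()

        # The classifier itself is trained by a separate, ordinary optimizer step
        # on an augmented training batch (its accumulated gradients from the
        # backward pass above are cleared by the usual zero_grad beforehand),
        # so updates to w and phi alternate online.
        return val_loss.item()

Because the hypergradient flows through the inner gradient, the augmentation must be differentiable with respect to its parameters; this is why differentiable transformation libraries such as Kornia are a natural fit for this kind of scheme.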
