Paper Information - Auxiliary Learning by Implicit Differentiation

Auxiliary Learning by Implicit Differentiation

Training with multiple auxiliary tasks is a common practice used in deep learning for improving the performance on the main task of interest. Two main challenges arise in this multi-task learning setting: (i) designing useful auxiliary tasks; and (ii) combining auxiliary tasks into a single coherent loss. We propose a novel framework, AuxiLearn, that targets both challenges, based on implicit differentiation. First, when useful auxiliaries are known, we propose learning a network that combines all losses into a single coherent objective function. This network can learn non-linear interactions between auxiliary tasks. Second, when no useful auxiliary task is known, we describe how to learn a network that generates a meaningful, novel auxiliary task. We evaluate AuxiLearn in a series of tasks and domains, including image segmentation and learning with attributes. We find that AuxiLearn consistently improves accuracy compared with competing methods.
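The abstract names implicit differentiation as the mechanism but gives no formulas. The sketch below is not the authors' implementation. It illustrates the general idea on a toy bilevel problem in which everything is an assumption made for illustration: a single scalar auxiliary weight `phi` stands in for the loss-combining network, all losses are quadratics, and the vectors `a`, `b`, `c` are hypothetical targets for the main, auxiliary, and validation losses. The inner problem fits `w` to the combined training loss, and the implicit function theorem gives the gradient of the validation loss with respect to `phi` without unrolling the inner optimisation.

```python
import numpy as np

def inner_solution(phi, a, b):
    """Minimiser of the combined training loss
    L_T(w) = 0.5*||w - a||^2 + phi * 0.5*||w - b||^2 (closed form for this toy case)."""
    return (a + phi * b) / (1.0 + phi)

def val_loss(w, c):
    """Main-task loss on held-out data (toy quadratic)."""
    return 0.5 * np.sum((w - c) ** 2)

def hypergradient(phi, a, b, c):
    """d val_loss / d phi via the implicit function theorem:
    -(d/dphi grad_w L_T)^T (hess_w L_T)^{-1} grad_w val_loss, evaluated at w*(phi)."""
    w = inner_solution(phi, a, b)
    hessian = (1.0 + phi) * np.eye(w.size)  # second derivative of L_T in w
    mixed = w - b                           # derivative of grad_w L_T in phi
    grad_val = w - c                        # gradient of the validation loss in w
    return -mixed @ np.linalg.solve(hessian, grad_val)

def finite_difference(phi, a, b, c, eps=1e-6):
    """Numerical check: central difference of val_loss(w*(phi)) in phi."""
    up = val_loss(inner_solution(phi + eps, a, b), c)
    down = val_loss(inner_solution(phi - eps, a, b), c)
    return (up - down) / (2.0 * eps)

if __name__ == "__main__":
    a = np.array([1.0, 0.0])  # hypothetical main-task training target
    b = np.array([0.0, 1.0])  # hypothetical auxiliary-task target
    c = np.array([0.2, 0.8])  # hypothetical validation target
    phi = 0.5
    print("implicit gradient:", hypergradient(phi, a, b, c))
    print("finite difference:", finite_difference(phi, a, b, c))
```

In this toy setting the Hessian is a scaled identity, so the linear solve is trivial. For deep networks the inverse-Hessian-vector product has to be approximated, and the scalar weight would be replaced by a learned, possibly non-linear, function of the individual losses.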

Gal Chechik | Ethan Fetaya | Aviv Navon | Haggai Maron | Idan Achituve
