Gradient Surgery for Multi-Task Learning
暂无分享,去创建一个
[1] Boris Polyak. Some methods of speeding up the convergence of iteration methods , 1964 .
[2] Paul H. Calamai,et al. Projected gradient methods for linearly constrained problems , 1987, Math. Program..
[3] Rich Caruana,et al. Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.
[4] W. Fulton. Eigenvalues, invariant factors, highest weights, and Schubert calculus , 1999, math/9908012.
[5] Tom Heskes,et al. Task Clustering and Gating for Bayesian Multitask Learning , 2003, J. Mach. Learn. Res..
[6] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[7] Alan Fern,et al. Multi-task reinforcement learning: a hierarchical Bayesian approach , 2007, ICML '07.
[8] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.
[9] Asuman E. Ozdaglar,et al. Distributed Subgradient Methods for Multi-Agent Optimization , 2009, IEEE Transactions on Automatic Control.
[10] Xiaoou Tang,et al. Facial Landmark Detection by Deep Multi-task Learning , 2014, ECCV.
[11] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[12] Oriol Vinyals,et al. Qualitatively characterizing neural network optimization problems , 2014, ICLR.
[13] Xiaogang Wang,et al. Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[14] Jianmin Wang,et al. Learning Multiple Tasks with Deep Relationship Networks , 2015, ArXiv.
[15] Dianhai Yu,et al. Multi-Task Learning for Multiple Language Translation , 2015, ACL.
[16] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[17] NOTES ON FIRST-ORDER METHODS FOR MINIMIZING SMOOTH FUNCTIONS , 2015 .
[18] Andrea Vedaldi,et al. Integrated perception with recurrent multi-task neural networks , 2016, NIPS.
[19] Razvan Pascanu,et al. Policy Distillation , 2015, ICLR.
[20] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[21] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[22] Sebastian Ramos,et al. The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Martial Hebert,et al. Cross-Stitch Networks for Multi-task Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Marc'Aurelio Ranzato,et al. Gradient Episodic Memory for Continual Learning , 2017, NIPS.
[25] Philip S. Yu,et al. Learning Multiple Tasks with Multilinear Relationship Networks , 2015, NIPS.
[26] Chrisantha Fernando,et al. PathNet: Evolution Channels Gradient Descent in Super Neural Networks , 2017, ArXiv.
[27] Sebastian Nowozin,et al. Stabilizing Training of Generative Adversarial Networks through Regularization , 2017, NIPS.
[28] Sebastian Ruder,et al. An Overview of Multi-Task Learning in Deep Neural Networks , 2017, ArXiv.
[29] Yongxin Yang,et al. Trace Norm Regularised Deep Multi-Task Learning , 2016, ICLR.
[30] Sergey Levine,et al. Learning modular neural network policies for multi-task and multi-robot transfer , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[31] Iasonas Kokkinos,et al. UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Roberto Cipolla,et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[33] Yee Whye Teh,et al. Distral: Robust multitask reinforcement learning , 2017, NIPS.
[34] Vladlen Koltun,et al. Multi-Task Learning as Multi-Objective Optimization , 2018, NeurIPS.
[35] Leonidas J. Guibas,et al. Taskonomy: Disentangling Task Transfer Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[36] Karol Hausman,et al. Learning an Embedding Space for Transferable Robot Skills , 2018, ICLR.
[37] Zhao Chen,et al. GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks , 2017, ICML.
[38] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[39] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[40] Alexey Dosovitskiy,et al. End-to-End Driving Via Conditional Imitation Learning , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[41] Richard Socher,et al. The Natural Language Decathlon: Multitask Learning as Question Answering , 2018, ArXiv.
[42] Martin A. Riedmiller,et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch , 2018, ICML.
[43] Qiang Yang,et al. An Overview of Multi-task Learning , 2018 .
[44] Razvan Pascanu,et al. Adapting Auxiliary Losses Using Gradient Similarity , 2018, ArXiv.
[45] Roberto Cipolla,et al. Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[46] Matthew Riemer,et al. Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning , 2017, ICLR.
[47] S. Levine,et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning , 2019, CoRL.
[48] Martin A. Riedmiller,et al. Regularized Hierarchical Policies for Compositional Transfer in Robotics , 2019, ArXiv.
[49] Marc'Aurelio Ranzato,et al. Efficient Lifelong Learning with A-GEM , 2018, ICLR.
[50] Andrew J. Davison,et al. End-To-End Multi-Task Learning With Attention , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Wojciech Czarnecki,et al. Multi-task Deep Reinforcement Learning with PopArt , 2018, AAAI.
[52] Iasonas Kokkinos,et al. Attentive Single-Tasking of Multiple Tasks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .
[54] Ignacio Cases,et al. Routing Networks and the Challenges of Modular and Compositional Computation , 2019, ArXiv.
[55] Yike Guo,et al. Regularizing Deep Multi-Task Networks using Orthogonal Gradients , 2019, ArXiv.
[56] Razvan Pascanu,et al. Ray Interference: a Source of Plateaus in Deep Reinforcement Learning , 2019, ArXiv.
[57] Razvan Pascanu,et al. Distilling Policy Distillation , 2019, AISTATS.
[58] Luc Van Gool,et al. Branched Multi-Task Networks: Deciding what layers to share , 2019, BMVC.
[59] Mingrui Liu,et al. Improved Schemes for Episodic Memory-based Lifelong Learning , 2019, NeurIPS.
[60] Mehrdad Farajtabar,et al. Orthogonal Gradient Descent for Continual Learning , 2019, AISTATS.