In Defense of the Unitary Scalarization for Deep Multi-Task Learning
[1] J. Gilmer, et al. Do Current Multi-Task Optimization Methods in Deep Learning Even Help?, 2022, NeurIPS.
[2] Ethan Fetaya, et al. Multi-Task Learning as a Bargaining Game, 2022, ICML.
[3] Peter Stone, et al. Conflict-Averse Gradient Descent for Multi-task Learning, 2021, NeurIPS.
[4] Qiang Yang, et al. Multi-Task Learning in Natural Language Processing: An Overview, 2021, ArXiv.
[5] S. Levine, et al. MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale, 2021, ArXiv.
[6] I. Valera, et al. RotoGrad: Gradient Homogenization in Multitask Learning, 2021, ICLR.
[7] Andrea Lodi, et al. Combinatorial optimization and reasoning with graph neural networks, 2021, IJCAI.
[8] Joelle Pineau, et al. Multi-Task Reinforcement Learning with Context-based Representations, 2021, ICML.
[9] Doina Precup, et al. Towards Continual Reinforcement Learning: A Review and Perspectives, 2020, J. Artif. Intell. Res.
[10] Shimon Whiteson, et al. Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver?, 2020, NeurIPS.
[11] Dragomir Anguelov, et al. Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout, 2020, NeurIPS.
[12] Yulia Tsvetkov, et al. Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models, 2020, ICLR.
[13] Tim Rocktäschel, et al. My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control, 2020, ICLR.
[14] Wenlong Huang, et al. One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control, 2020, ICML.
[15] Daniel Ulbricht, et al. Learning to Branch for Multi-Task Learning, 2020, ICML.
[16] Wouter Van Gansbeke, et al. Multi-Task Learning for Dense Prediction Tasks: A Survey, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[17] Timothy M. Hospedales, et al. Meta-Learning in Neural Networks: A Survey, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[18] Matthew E. Taylor, et al. Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey, 2020, J. Mach. Learn. Res.
[19] S. Levine, et al. Gradient Surgery for Multi-Task Learning, 2020, NeurIPS.
[20] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[21] S. Levine, et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning, 2019, CoRL.
[22] Samet Oymak, et al. Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks, 2019, AISTATS.
[23] Vladlen Koltun, et al. Multi-Task Learning as Multi-Objective Optimization, 2018, NeurIPS.
[24] Wojciech Czarnecki, et al. Multi-task Deep Reinforcement Learning with PopArt, 2018, AAAI.
[25] Li Fei-Fei, et al. Dynamic Task Prioritization for Multitask Learning, 2018, ECCV.
[26] Yuanzhi Li, et al. A Convergence Theory for Deep Learning via Over-Parameterization, 2018, ICML.
[27] Andrew J. Davison, et al. End-To-End Multi-Task Learning With Attention, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Yuanzhi Li, et al. An Alternative View: When Does SGD Escape Local Minima?, 2018, ICML.
[29] George Papandreou, et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, 2018, ECCV.
[30] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[31] Raef Bassily, et al. The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning, 2017, ICML.
[32] Zhao Chen, et al. GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks, 2017, ICML.
[33] Geoffrey E. Hinton, et al. Dynamic Routing Between Capsules, 2017, NIPS.
[34] Yee Whye Teh, et al. Distral: Robust multitask reinforcement learning, 2017, NIPS.
[35] Thomas A. Funkhouser, et al. Dilated Residual Networks, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Roberto Cipolla, et al. Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[37] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[38] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[39] Iasonas Kokkinos, et al. UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] A. Gupta, et al. Cross-Stitch Networks for Multi-task Learning, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Sebastian Ramos, et al. The Cityscapes Dataset for Semantic Urban Scene Understanding, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] David Silver, et al. Learning values across many orders of magnitude, 2016, NIPS.
[43] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Ruslan Salakhutdinov, et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, 2015, ICLR.
[45] Razvan Pascanu, et al. Policy Distillation, 2015, ICLR.
[46] Roberto Cipolla, et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[47] Christian Szegedy, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[48] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[49] Xiaogang Wang, et al. Deep Learning Face Attributes in the Wild, 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[50] Jasha Droppo, et al. Multi-task learning in deep neural networks for improved phoneme recognition, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[51] J. Désidéri. Multiple-gradient descent algorithm (MGDA) for multiobjective optimization, 2012.
[52] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML '08.
[53] Massimiliano Pontil, et al. Regularized multi-task learning, 2004, KDD.
[54] Tom Heskes, et al. Task Clustering and Gating for Bayesian Multitask Learning, 2003, J. Mach. Learn. Res.
[55] Jörg Fliege, et al. Steepest descent methods for multicriteria optimization, 2000, Math. Methods Oper. Res.
[56] Tom Heskes, et al. Empirical Bayes for Learning to Learn, 2000, ICML.
[57] Rich Caruana, et al. Multitask Learning, 1997, Machine Learning.
[58] Thomas G. Dietterich. Overfitting and undercomputing in machine learning, 1995, CSUR.
[59] Yu Zhang, et al. A Closer Look at Loss Weighting in Multi-Task Learning, 2021, ArXiv.
[60] Qingmin Liao, et al. Towards Impartial Multi-task Learning, 2021, ICLR.
[61] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[62] Simon Haykin, et al. Gradient-Based Learning Applied to Document Recognition, 2001.
[63] Rich Caruana, et al. Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping, 2000, NIPS.
[64] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.