暂无分享,去创建一个
[1] Thomas Wolf,et al. A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks , 2018, AAAI.
[2] C. E. Lemke,et al. Bimatrix Equilibrium Points and Mathematical Programming , 1965 .
[3] Li Fei-Fei,et al. Dynamic Task Prioritization for Multitask Learning , 2018, ECCV.
[4] Graham Neubig,et al. XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization , 2020, ICML.
[5] Yulia Tsvetkov,et al. Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models , 2020, ICLR.
[6] Vladlen Koltun,et al. Multi-Task Learning as Multi-Objective Optimization , 2018, NeurIPS.
[7] Laurent El Ghaoui,et al. Robust Optimization , 2021, ICORES.
[8] O. H. Brownlee,et al. ACTIVITY ANALYSIS OF PRODUCTION AND ALLOCATION , 1952 .
[9] Matthew Riemer,et al. Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning , 2017, ICLR.
[10] Colin Raffel,et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res..
[11] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[12] Richard Socher,et al. The Natural Language Decathlon: Multitask Learning as Question Answering , 2018, ArXiv.
[13] Zhaolin Hu,et al. Kullback-Leibler divergence constrained distributionally robust optimization , 2012 .
[14] Colin Raffel,et al. mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer , 2021, NAACL.
[15] Roberto Cipolla,et al. Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[16] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[17] Paul Michel,et al. Examining and Combating Spurious Features under Distribution Shift , 2021, ICML.
[18] Timnit Gebru,et al. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification , 2018, FAT.
[19] Yiming Yang,et al. DARTS: Differentiable Architecture Search , 2018, ICLR.
[20] John Darzentas,et al. Problem Complexity and Method Efficiency in Optimization , 1983 .
[21] Eliyahu Kiperwasser,et al. Scheduled Multi-Task Learning: From Syntax to Translation , 2018, TACL.
[22] Sebastian Ruder,et al. Neural transfer learning for natural language processing , 2019 .
[23] W. Matusik,et al. Effcient Continuous Pareto Exploration in Multi-Task Learning , 2020 .
[24] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[25] Yingyu Liang,et al. Loss-Balanced Task Weighting to Reduce Negative Transfer in Multi-Task Learning , 2019, AAAI.
[26] E. Jaynes. Information Theory and Statistical Mechanics , 1957 .
[27] Manfred K. Warmuth,et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors , 1997, Inf. Comput..
[28] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[29] Percy Liang,et al. Distributionally Robust Language Modeling , 2019, EMNLP.
[30] Percy Liang,et al. Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization , 2019, ArXiv.
[31] Sebastian Ruder,et al. An Overview of Multi-Task Learning in Deep Neural Networks , 2017, ArXiv.
[32] Taku Kudo,et al. Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates , 2018, ACL.
[33] Zhao Chen,et al. GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks , 2017, ICML.
[34] Yair Carmon,et al. Large-Scale Methods for Distributionally Robust Optimization , 2020, NeurIPS.
[35] Ryan Cotterell,et al. What Kind of Language Is Hard to Language-Model? , 2019, ACL.
[36] F. Jelinek,et al. Perplexity—a measure of the difficulty of speech recognition tasks , 1977 .
[37] Orhan Firat,et al. Massively Multilingual Neural Machine Translation , 2019, NAACL.
[38] Guy Jumarie,et al. Relative Information — What For? , 1990 .
[39] Rongrong Ji,et al. Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Veselin Stoyanov,et al. Unsupervised Cross-lingual Representation Learning at Scale , 2019, ACL.
[41] Wojciech Czarnecki,et al. Multi-task Deep Reinforcement Learning with PopArt , 2018, AAAI.
[42] J. N. Kapur,et al. Entropy Optimization Principles and Their Applications , 1992 .