AANG: Automating Auxiliary Learning
[1] Y. Zhuang, et al. Consecutive Pretraining: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain, 2022, Remote Sensing.
[2] Sanket Vaibhav Mehta, et al. ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning, 2021, ArXiv.
[3] Zhilin Yang, et al. NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework, 2021, ICML.
[4] Sang Michael Xie, et al. An Explanation of In-context Learning as Implicit Bayesian Inference, 2021, ICLR.
[5] Paul Michel, et al. Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative, 2021, ICLR.
[6] Yann Dauphin, et al. Auxiliary Task Update Decomposition: The Good, The Bad and The Neutral, 2021, ICLR.
[7] Tri Dao, et al. Rethinking Neural Operations for Diverse Tasks, 2021, NeurIPS.
[8] Sonal Gupta, et al. Muppet: Massive Multi-task Representations with Pre-Finetuning, 2021, EMNLP.
[9] Sanjeev Arora, et al. A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks, 2020, ICLR.
[10] Vishrav Chaudhary, et al. Self-training Improves Pre-training for Natural Language Understanding, 2020, NAACL.
[11] Ethan Fetaya, et al. Auxiliary Learning by Implicit Differentiation, 2020, ICLR.
[12] Maria-Florina Balcan, et al. Geometry-Aware Gradient Algorithms for Neural Architecture Search, 2020, ICLR.
[13] Rong Jin, et al. Improved Fine-tuning by Leveraging Pre-training Data: Theory and Practice, 2021, ArXiv.
[14] Trevor Darrell, et al. Auxiliary Task Reweighting for Minimum-data Learning, 2020, NeurIPS.
[15] Dhruv Batra, et al. Auxiliary Tasks Speed Up Learning PointGoal Navigation, 2020, CoRL.
[16] Pierre H. Richemond, et al. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, 2020, NeurIPS.
[17] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[18] Atri Rudra, et al. Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps, 2020, ICLR.
[19] Doug Downey, et al. Don't Stop Pretraining: Adapt Language Models to Domains and Tasks, 2020, ACL.
[20] Quoc V. Le, et al. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, 2020, ICLR.
[21] Xipeng Qiu, et al. Pre-trained Models for Natural Language Processing: A Survey, 2020, Science China Technological Sciences.
[22] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.
[23] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, JMLR.
[24] Omer Levy, et al. SpanBERT: Improving Pre-training by Representing and Predicting Spans, 2019, TACL.
[25] Daniel S. Weld, et al. S2ORC: The Semantic Scholar Open Research Corpus, 2020, ACL.
[26] Stefan Lee, et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks, 2019, NeurIPS.
[27] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[28] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[29] Yi Yang, et al. Searching for a Robust Neural Architecture in Four GPU Hours, 2019, CVPR.
[30] Thomas Wolf, et al. Transfer Learning in Natural Language Processing, 2019, NAACL.
[31] Benno Stein, et al. SemEval-2019 Task 4: Hyperpartisan News Detection, 2019, SemEval.
[32] Quoc V. Le, et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, 2019, ICML.
[33] Quoc V. Le, et al. Searching for MobileNetV3, 2019, ICCV.
[34] Ronan Collobert, et al. wav2vec: Unsupervised Pre-training for Speech Recognition, 2019, INTERSPEECH.
[35] Iz Beltagy, et al. SciBERT: Pretrained Contextualized Embeddings for Scientific Text, 2019, ArXiv.
[36] Andrew J. Davison, et al. Self-Supervised Generalisation with Meta Auxiliary Learning, 2019, NeurIPS.
[37] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[38] Yiming Yang, et al. DARTS: Differentiable Architecture Search, 2018, ICLR.
[39] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[40] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[41] David Held, et al. Adaptive Auxiliary Task Weighting for Reinforcement Learning, 2019, NeurIPS.
[42] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[43] Mari Ostendorf, et al. Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction, 2018, EMNLP.
[44] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, ArXiv.
[45] Daniel Jurafsky, et al. Measuring the Evolution of a Scientific Field through Citation Frames, 2018, TACL.
[46] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[47] Quoc V. Le, et al. Efficient Neural Architecture Search via Parameter Sharing, 2018, ICML.
[48] Christoph H. Lampert, et al. Data-Dependent Stability of Stochastic Gradient Descent, 2017, ICML.
[49] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[50] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[51] Quoc V. Le, et al. Neural Architecture Search with Reinforcement Learning, 2016, ICLR.
[52] Alexei A. Efros, et al. What Makes ImageNet Good for Transfer Learning?, 2016, ArXiv.
[53] Jitendra Malik, et al. Learning to Poke by Poking: Experiential Learning of Intuitive Physics, 2016, NIPS.
[54] Saif Mohammad, et al. SemEval-2016 Task 6: Detecting Stance in Tweets, 2016, SemEval.
[55] Yoram Singer, et al. Train Faster, Generalize Better: Stability of Stochastic Gradient Descent, 2015, ICML.
[56] Tudor I. Oprea, et al. ChemProt-3.0: A Global Chemical Biology Diseases Mapping, 2016, Database: The Journal of Biological Databases and Curation.
[57] Rich Caruana, et al. Multitask Learning, 1997, Machine Learning.
[58] Xavier Carreras, et al. A Simple Named Entity Extractor Using AdaBoost, 2003, CoNLL.
[59] Risto Miikkulainen, et al. Evolving Neural Networks through Augmenting Topologies, 2002, Evolutionary Computation.
[60] André Elisseeff, et al. Stability and Generalization, 2002, JMLR.
[61] Jonathan Baxter, et al. A Model of Inductive Bias Learning, 2000, JAIR.
[62] Eugene Charniak, et al. Statistical Techniques for Natural Language Parsing, 1997, AI Magazine.