Yann Dauphin | David Grangier | Lucio M. Dery
[1] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.
[2] Rich Caruana, et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.
[3] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML.
[4] Mark Tygert, et al. A Randomized Algorithm for Principal Component Analysis, 2008, SIAM J. Matrix Anal. Appl.
[5] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[6] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[7] William D. Lewis, et al. Intelligent Selection of Language Model Training Data, 2010, ACL.
[8] Jianfeng Gao, et al. Domain Adaptation via Pseudo In-Domain Data Selection, 2011, EMNLP.
[9] Brendan T. O'Connor, et al. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, 2011.
[10] Christopher Potts, et al. Learning Word Vectors for Sentiment Analysis, 2011, ACL.
[11] Nathan Halko, et al. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, 2009, SIAM Rev.
[12] Razvan Pascanu, et al. On the difficulty of training recurrent neural networks, 2012, ICML.
[13] Xiaoou Tang, et al. Facial Landmark Detection by Deep Multi-task Learning, 2014, ECCV.
[14] Ya Le, et al. Tiny ImageNet Visual Recognition Challenge, 2015.
[15] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[16] Anton van den Hengel, et al. Image-Based Recommendations on Styles and Substitutes, 2015, SIGIR.
[17] Nikos Komodakis, et al. Wide Residual Networks, 2016, BMVC.
[18] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, CVPR.
[19] Martial Hebert, et al. Cross-Stitch Networks for Multi-task Learning, 2016, CVPR.
[20] Marc'Aurelio Ranzato, et al. Gradient Episodic Memory for Continual Learning, 2017, NIPS.
[21] Sebastian Ruder, et al. An Overview of Multi-Task Learning in Deep Neural Networks, 2017, arXiv.
[22] John C. Duchi, et al. Variance-based Regularization with Convex Objectives, 2016, NIPS.
[23] Yuji Nakatsukasa, et al. Accuracy of singular vectors obtained by projection-based SVD methods, 2017.
[24] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[25] Barak A. Pearlmutter, et al. Automatic differentiation in machine learning: a survey, 2015, J. Mach. Learn. Res.
[26] Matthijs Douze, et al. Deep Clustering for Unsupervised Learning of Visual Features, 2018, ECCV.
[27] Vladlen Koltun, et al. Multi-Task Learning as Multi-Objective Optimization, 2018, NeurIPS.
[28] Zhao Chen, et al. GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks, 2017, ICML.
[29] Zhao Chen, et al. Gradient Adversarial Training of Neural Networks, 2018, arXiv.
[30] Quoc V. Le, et al. Domain Adaptive Transfer Learning with Specialist Models, 2018, arXiv.
[31] Qiang Yang, et al. An Overview of Multi-task Learning, 2018.
[32] Razvan Pascanu, et al. Adapting Auxiliary Losses Using Gradient Similarity, 2018, arXiv.
[33] Matthew Riemer, et al. Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning, 2017, ICLR.
[34] Amit Kumar Jaiswal, et al. Identifying pneumonia in chest X-rays: A deep learning approach, 2019, Measurement.
[35] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, arXiv.
[36] Xu Tan, et al. MASS: Masked Sequence to Sequence Pre-training for Language Generation, 2019, ICML.
[37] Julien Mairal, et al. Unsupervised Pre-Training of Image Features on Non-Curated Data, 2019, ICCV.
[38] Marc'Aurelio Ranzato, et al. Efficient Lifelong Learning with A-GEM, 2018, ICLR.
[39] Wojciech Czarnecki, et al. Multi-task Deep Reinforcement Learning with PopArt, 2018, AAAI.
[40] Yifan Yu, et al. CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison, 2019, AAAI.
[41] Quoc V. Le, et al. Do Better ImageNet Models Transfer Better?, 2018, CVPR.
[42] David Held, et al. Adaptive Auxiliary Task Weighting for Reinforcement Learning, 2019, NeurIPS.
[43] Yike Guo, et al. Regularizing Deep Multi-Task Networks using Orthogonal Gradients, 2019, arXiv.
[44] Jon Kleinberg, et al. Transfusion: Understanding Transfer Learning for Medical Imaging, 2019, NeurIPS.
[45] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[46] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[47] Xinyi Wang, et al. Optimizing Data Usage via Differentiable Rewards, 2019, ICML.
[48] S. Levine, et al. Gradient Surgery for Multi-Task Learning, 2020, NeurIPS.
[49] Luc Van Gool, et al. Revisiting Multi-Task Learning in the Deep Learning Era, 2020, arXiv.
[50] Ye Tian, et al. Learning a Multi-Domain Curriculum for Neural Machine Translation, 2020, ACL.
[51] Mehrdad Farajtabar, et al. Orthogonal Gradient Descent for Continual Learning, 2019, AISTATS.
[52] Doug Downey, et al. Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks, 2020, ACL.