Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts
Zhe Zhao | Ed H. Chi | Xinyang Yi | Jiaqi Ma | Lichan Hong | Jilin Chen