Yanqi Zhou | Sudip Roy | Amirali Abdolrashidi | Daniel Wong | Peter C. Ma | Qiumin Xu | Ming Zhong | Hanxiao Liu | Anna Goldie | Azalia Mirhoseini | James Laudon
[1] Kaiming He et al. Exploring the Limits of Weakly Supervised Pretraining, 2018, ECCV.
[2] Sergey Ioffe et al. Rethinking the Inception Architecture for Computer Vision, 2016, CVPR.
[3] Baochun Li et al. Spotlight: Optimizing Device Placement for Training Deep Neural Networks, 2018, ICML.
[4] Bruno A. Olshausen et al. Superposition of many models into one, 2019, NeurIPS.
[5] Jure Leskovec et al. Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation, 2018, NeurIPS.
[6] Quoc V. Le et al. A Hierarchical Model for Device Placement, 2018, ICLR.
[7] Lukasz Kaiser et al. Attention Is All You Need, 2017, NIPS.
[8] Quoc V. Le et al. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism, 2018, arXiv.
[9] Yiming Yang et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[10] Ilya Sutskever et al. Language Models are Unsupervised Multitask Learners, 2019.
[11] Vipin Kumar et al. A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs, 1998, SIAM J. Sci. Comput.
[12] Jure Leskovec et al. How Powerful are Graph Neural Networks?, 2018, ICLR.
[13] George Kurian et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, arXiv.
[14] Heiga Zen et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[15] Hongzi Mao et al. Placeto: Learning Generalizable Device Placement Algorithms for Distributed Machine Learning, 2019, NeurIPS.
[16] Vinod Nair et al. REGAL: Transfer Learning For Fast Optimization of Computation Graphs, 2019, arXiv.
[17] Geoffrey E. Hinton et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, 2017, ICLR.
[18] Alexander Aiken et al. Beyond Data and Model Parallelism for Deep Neural Networks, 2018, SysML.
[19] Yang Yang et al. Deep Learning Scaling is Predictable, Empirically, 2017, arXiv.
[20] Alok Aggarwal et al. Regularized Evolution for Image Classifier Architecture Search, 2018, AAAI.
[21] Yonghui Wu et al. Exploring the Limits of Language Modeling, 2016, arXiv.
[22] Jure Leskovec et al. Inductive Representation Learning on Large Graphs, 2017, NIPS.
[23] Quoc V. Le et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[24] Samy Bengio et al. Device Placement Optimization with Reinforcement Learning, 2017, ICML.