[1] Kurt Keutzer, et al. Integrated Model, Batch, and Domain Parallelism in Training Neural Networks, 2017, SPAA.
[2] Zhiyuan Liu, et al. Graph Neural Networks: A Review of Methods and Applications, 2018, AI Open.
[3] Leonid Oliker, et al. Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication, 2016, 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[4] Ah Chung Tsoi, et al. The Graph Neural Network Model, 2009, IEEE Transactions on Neural Networks.
[5] Tommi Vatanen, et al. Structure-Based Function Prediction using Graph Convolutional Networks, 2019, bioRxiv.
[6] Max Welling, et al. Semi-Supervised Classification with Graph Convolutional Networks, 2016, ICLR.
[7] Philip S. Yu, et al. A Comprehensive Survey on Graph Neural Networks, 2019, IEEE Transactions on Neural Networks and Learning Systems.
[8] Marc Snir, et al. Channel and Filter Parallelism for Large-Scale CNN Training, 2019, SC.
[9] John R. Gilbert, et al. On the Representation and Multiplication of Hypersparse Matrices, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[10] Torsten Hoefler, et al. Demystifying Parallel and Distributed Deep Learning, 2018, ACM Comput. Surv.
[11] Alexander Aiken, et al. Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc, 2020, MLSys.
[12] Ramesh C. Agarwal, et al. A Three-Dimensional Approach to Parallel Matrix Multiplication, 1995, IBM J. Res. Dev.
[13] Alexander J. Smola, et al. Deep Graph Library: Towards Efficient and Scalable Deep Learning on Graphs, 2019, ArXiv.
[14] Rajeev Thakur, et al. Optimization of Collective Communication Operations in MPICH, 2005, Int. J. High Perform. Comput. Appl.
[15] Samuel Williams, et al. Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations, 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[16] Yafei Dai, et al. NeuGraph: Parallel Deep Neural Network Computation on Large Graphs, 2019, USENIX ATC.
[17] Jan Eric Lenssen, et al. Fast Graph Representation Learning with PyTorch Geometric, 2019, ArXiv.
[18] Dan Alistarh, et al. SparCML: High-Performance Sparse Communication for Machine Learning, 2018, SC.
[19] Jure Leskovec, et al. Inductive Representation Learning on Large Graphs, 2017, NIPS.
[20] Georgios A. Pavlopoulos, et al. HipMCL: A High-Performance Parallel Implementation of the Markov Clustering Algorithm for Large-Scale Networks, 2018, Nucleic Acids Research.
[21] Robert A. van de Geijn, et al. SUMMA: Scalable Universal Matrix Multiplication Algorithm, 1995, Concurr. Pract. Exp.
[22] Jure Leskovec, et al. How Powerful are Graph Neural Networks?, 2018, ICLR.
[23] John D. Owens, et al. Design Principles for Sparse Matrix Multiplication on the GPU, 2018, Euro-Par.
[24] Marc Snir, et al. Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism, 2019, 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[25] Chang Zhou, et al. AliGraph: A Comprehensive Graph Neural Network Platform, 2019, Proc. VLDB Endow.
[26] Peter Sanders, et al. Recent Advances in Graph Partitioning, 2013, Algorithm Engineering.
[27] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[28] Robert A. van de Geijn, et al. Collective Communication: Theory, Practice, and Experience, 2007, Concurr. Comput. Pract. Exp.
[29] James Demmel, et al. Minimizing Communication in Numerical Linear Algebra, 2009, SIAM J. Matrix Anal. Appl.
[30] Samuel Williams, et al. Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication, 2015, SIAM J. Sci. Comput.
[31] James Demmel, et al. Cyclops Tensor Framework: Reducing Communication and Eliminating Load Imbalance in Massively Parallel Contractions, 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[32] Alexander Peysakhovich, et al. PyTorch-BigGraph: A Large-Scale Graph Embedding System, 2019, SysML.
[33] James Demmel, et al. Communication Optimal Parallel Multiplication of Sparse Random Matrices, 2013, SPAA.
[34] John R. Gilbert, et al. The Combinatorial BLAS: Design, Implementation, and Applications, 2011, Int. J. High Perform. Comput. Appl.
[35] Dustin Tran, et al. Mesh-TensorFlow: Deep Learning for Supercomputers, 2018, NeurIPS.