A Simple Yet Effective Balanced Edge Partition Model for Parallel Computing
Mario Szegedy | Yan-Hao Chen | Ari B. Hayes | Eddy Z. Zhang | Lingda Li | Robel Geda | Pranav Chaudhari