Making pull-based graph processing performant
Christoforos E. Kozyrakis | Heiner Litz | Samuel Grossman
[1] Trishul M. Chilimbi. Efficient representations and abstractions for quantifying and exploiting data reference locality, 2001, PLDI '01.
[2] Alexander S. Szalay,et al. FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs , 2014, FAST.
[3] Barbara M. Chapman,et al. A Runtime Implementation of OpenMP Tasks , 2011, IWOMP.
[4] David A. Patterson,et al. Direction-optimizing Breadth-First Search , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[5] Alexandru Iosup,et al. How Well Do Graph-Processing Platforms Perform? An Empirical Performance Evaluation and Analysis , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[6] Eli Upfal,et al. A simple load balancing scheme for task allocation in parallel machines , 1991, SPAA '91.
[7] Binyu Zang,et al. PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs , 2019, TOPC.
[8] Michael Garland,et al. Implementing sparse matrix-vector multiplication on throughput-oriented processors , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[9] Margaret Martonosi,et al. Graphicionado: A high-performance and energy-efficient accelerator for graph analytics , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[10] Katherine A. Yelick,et al. Optimizing parallel programs with explicit synchronization , 1995, PLDI '95.
[11] Xing Liu,et al. Efficient sparse matrix-vector multiplication on x86-based many-core processors , 2013, ICS '13.
[12] Ken Kennedy,et al. Improving cache performance in dynamic applications through data and computation reorganization at run time , 1999, PLDI '99.
[13] Pradeep Dubey,et al. GraphMat: High performance graph analytics made productive , 2015, Proc. VLDB Endow.
[14] Chau-Wen Tseng,et al. A Comparison of Locality Transformations for Irregular Codes , 2000, LCR.
[15] Larry Carter,et al. Rescheduling for Locality in Sparse Matrix Computations , 2001, International Conference on Computational Science.
[16] Wenguang Chen,et al. GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning , 2015, USENIX ATC.
[17] Carlos Guestrin,et al. Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud , 2012.
[18] D. Patterson,et al. Searching for a Parent Instead of Fighting Over Children: A Fast Breadth-First Search Implementation for Graph 500 , 2011.
[19] David A. Patterson,et al. Locality Exists in Graph Processing: Workload Characterization on an Ivy Bridge Server , 2015, 2015 IEEE International Symposium on Workload Characterization.
[20] Marco Rosa,et al. Layered label propagation: a multiresolution coordinate-free ordering for compressing social networks , 2010, WWW.
[21] Panos Kalnis,et al. Mizan: a system for dynamic load balancing in large-scale graph processing , 2013, EuroSys '13.
[22] Lu Yao,et al. Implementing Sparse Matrix-Vector multiplication using CUDA based on a hybrid sparse matrix format , 2010, 2010 International Conference on Computer Application and System Modeling (ICCASM 2010).
[23] Weimin Zheng,et al. Exploring the Hidden Dimension in Graph Processing , 2016, OSDI.
[24] Mario Szegedy,et al. A Simple Yet Effective Balanced Edge Partition Model for Parallel Computing , 2017, SIGMETRICS.
[25] Haibo Chen,et al. SYNC or ASYNC: time to fuse for distributed graph-parallel computation , 2015, PPoPP.
[26] Guy E. Blelloch,et al. GraphChi: Large-Scale Graph Computation on Just a PC , 2012, OSDI.
[27] Chen Ding,et al. Program locality analysis using reuse distance , 2009, TOPL.
[28] Alberto Montresor,et al. An evaluation study of BigData frameworks for graph processing , 2013, 2013 IEEE International Conference on Big Data.
[29] Jure Leskovec,et al. SNAP Datasets: Stanford Large Network Dataset Collection , 2014.
[30] Jonathan W. Berry,et al. Challenges in Parallel Graph Processing , 2007, Parallel Process. Lett.
[31] Arutyun Avetisyan,et al. Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures , 2010, HiPEAC.
[32] Willy Zwaenepoel,et al. Chaos: scale-out graph processing from secondary storage , 2015, SOSP.
[33] Lawrence Rauchwerger,et al. The LRPD test: speculative run-time parallelization of loops with privatization and reduction parallelization , 1995, PLDI '95.
[34] Pavel Tvrdík,et al. Evaluation Criteria for Sparse Matrix Storage Formats , 2016, IEEE Transactions on Parallel and Distributed Systems.
[35] Francisco F. Rivera,et al. Exploiting locality in the run-time parallelization of irregular loops , 2002, Proceedings International Conference on Parallel Processing.
[36] Wencong Xiao,et al. GraM: scaling graph computation to the trillions , 2015, SoCC.
[37] Dimitrios S. Nikolopoulos,et al. Accelerating Graph Analytics by Utilising the Memory Locality of Graph Partitioning , 2017, 2017 46th International Conference on Parallel Processing (ICPP).
[38] Christos Faloutsos,et al. R-MAT: A Recursive Model for Graph Mining , 2004, SDM.
[39] Sebastiano Vigna,et al. The webgraph framework I: compression techniques , 2004, WWW '04.
[40] Leslie G. Valiant,et al. A bridging model for parallel computation , 1990, CACM.
[41] Reynold Xin,et al. GraphX: Graph Processing in a Distributed Dataflow Framework , 2014, OSDI.
[42] Chen Ding,et al. Software behavior oriented parallelization , 2007, PLDI '07.
[43] Anthony P. Reeves,et al. Strategies for Dynamic Load Balancing on Highly Parallel Computers , 1993, IEEE Trans. Parallel Distributed Syst.
[44] Keshav Pingali,et al. A lightweight infrastructure for graph analytics , 2013, SOSP.
[45] Aart J. C. Bik,et al. Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.
[46] Jennifer Widom,et al. GPS: a graph processing system , 2013, SSDBM.
[47] Joseph Gonzalez,et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs , 2012, OSDI.
[48] John R. Gilbert,et al. Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks , 2009, SPAA '09.
[49] John D. Owens,et al. Gunrock: a high-performance graph processing library on the GPU , 2015, PPoPP.
[50] Guy E. Blelloch,et al. Ligra: a lightweight graph processing framework for shared memory , 2013, PPoPP '13.
[51] Chau-Wen Tseng,et al. Exploiting locality for irregular scientific codes , 2006, IEEE Transactions on Parallel and Distributed Systems.
[52] Juan Touriño,et al. An Inspector-Executor Algorithm for Irregular Assignment Parallelization , 2004, ISPA.
[53] Ming Wu,et al. Managing Large Graphs on Multi-Cores with Graph Awareness , 2012, USENIX Annual Technical Conference.
[54] Christos Faloutsos,et al. PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations , 2009, 2009 Ninth IEEE International Conference on Data Mining.
[55] M. Tamer Özsu,et al. An Experimental Comparison of Pregel-like Graph Processing Systems , 2014, Proc. VLDB Endow.
[56] Chen Ding,et al. Array regrouping and structure splitting using whole-program reference affinity , 2004, PLDI '04.
[57] Kunle Olukotun,et al. Efficient Parallel Graph Exploration on Multi-Core CPU and GPU , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[58] Haibo Chen,et al. NUMA-aware graph-structured analytics , 2015, PPoPP.
[59] Lawrence Rauchwerger,et al. The R-LRPD test: speculative parallelization of partially parallel loops , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.
[60] Willy Zwaenepoel,et al. X-Stream: edge-centric graph processing using streaming partitions , 2013, SOSP.
[61] Samuel Williams,et al. Reduced-Bandwidth Multithreaded Algorithms for Sparse Matrix-Vector Multiplication , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.