Paper Information - Graph Processing on GPUs - 字舞流文

Graph Processing on GPUs

Xuanhua Shi | Ligang He | Zhigao Zheng | Bo Liu | Yongluan Zhou | Hai Jin | Qiangsheng Hua

[1] Elwood S. Buffa,et al. Graph Theory with Applications , 1977 .

[2] Udo Hahn,et al. Computing text Constituency: An Algorithmic Approach to the Generation of Text Graphs , 1984, SIGIR.

[3] Edward A. Lee,et al. Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing , 1989, IEEE Transactions on Computers.

[4] Gul A. Agha,et al. ACTORS - a model of concurrent computation in distributed systems , 1985, MIT Press series in artificial intelligence.

[5] Leslie G. Valiant,et al. A bridging model for parallel computation , 1990, CACM.

[6] Guy E. Blelloch,et al. Vector Models for Data-Parallel Computing , 1990 .

[7] George A. Miller,et al. WordNet: A Lexical Database for English , 1995, HLT.

[8] Ramesh Subramonian,et al. LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.

[9] Matthew Richardson,et al. The Intelligent surfer: Probabilistic Combination of Link and Content Information in PageRank , 2001, NIPS.

[10] Jens H. Krüger,et al. A Survey of General-Purpose Computation on Graphics Hardware , 2007, Eurographics.

[11] Tor M. Aamodt,et al. Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[12] P. J. Narayanan,et al. Accelerating Large Graph Algorithms on the GPU Using CUDA , 2007, HiPC.

[13] Richard T. Watson,et al. The centrality and prestige of CACM , 2008, CACM.

[14] P. J. Narayanan,et al. CUDA cuts: Fast graph cuts on the GPU , 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[15] Michael Garland,et al. Efficient Sparse Matrix-Vector Multiplication on CUDA , 2008 .

[16] Naga K. Govindaraju,et al. Mars: A MapReduce Framework on graphics processors , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[17] Joseph T. Kider,et al. All-pairs shortest-paths for large graphs on the GPU , 2008, GH '08.

[18] P. J. Narayanan,et al. Fast minimum spanning tree for large graphs on the GPU , 2009, High Performance Graphics.

[19] Michael Garland,et al. Implementing sparse matrix-vector multiplication on throughput-oriented processors , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.

[20] Edward T. Grochowski,et al. Larrabee: A many-Core x86 architecture for visual computing , 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).

[21] James Demmel,et al. A View of the Parallel Computing Landscape , 2009, CACM.

[22] Gregory E. Chamitoff,et al. Orders-of-magnitude performance increases in GPU-accelerated correlation of images from the International Space Station , 2010, Journal of Real-Time Image Processing.

[23] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SKDD.

[24] Arutyun Avetisyan,et al. Implementing Blocked Sparse Matrix-Vector Multiplication on NVIDIA GPUs , 2009, SAMOS.

[25] John R. Gilbert,et al. Solving path problems on the GPU , 2010, Parallel Comput..

[26] Aart J. C. Bik,et al. Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.

[27] William Gropp,et al. An adaptive performance modeling tool for GPU architectures , 2010, PPoPP '10.

[28] Kai Li,et al. Fidelity and scaling of the PARSEC benchmark inputs , 2010, IEEE International Symposium on Workload Characterization (IISWC'10).

[29] Hong Chen,et al. Parallel SimRank computation on large graphs with iterative aggregation , 2010, KDD.

[30] Bin Wu,et al. Cloud-based Connected Component Algorithm , 2010, 2010 International Conference on Artificial Intelligence and Computational Intelligence.

[31] Borko Furht,et al. Exploring NVIDIA-CUDA for video coding , 2010, MMSys '10.

[32] Kevin Skadron,et al. Dynamic warp subdivision for integrated branch and memory divergence tolerance , 2010, ISCA.

[33] P. J. Narayanan,et al. A fast GPU algorithm for graph connectivity , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).

[34] Feng Yan,et al. Efficient PageRank and SpMV Computation on AMD GPUs , 2010, 2010 39th International Conference on Parallel Processing.

[35] Kunle Olukotun,et al. Accelerating CUDA graph algorithms at maximum warp , 2011, PPoPP '11.

[36] Emmett Kilgariff,et al. Fermi GF100 GPU Architecture , 2011, IEEE Micro.

[37] Kunle Olukotun,et al. Efficient Parallel Graph Exploration on Multi-Core CPU and GPU , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.

[38] Bingsheng He,et al. Mars: Accelerating MapReduce with Graphics Processors , 2011, IEEE Transactions on Parallel and Distributed Systems.

[39] Rob H. Bisseling,et al. A GPU Algorithm for Greedy Graph Matching , 2011, Facing the Multicore-Challenge.

[40] Onur Mutlu,et al. Improving GPU performance via large warps and two-level warp scheduling , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[41] Kurt Keutzer,et al. clSpMV: A Cross-Platform OpenCL SpMV Framework on GPUs , 2012, ICS '12.

[42] Andrew S. Grimshaw,et al. Scalable GPU graph traversal , 2012, PPoPP '12.

[43] Arnon Rungsawang,et al. Fast PageRank Computation on a GPU Cluster , 2012, 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing.

[44] Guy E. Blelloch,et al. GraphChi: Large-Scale Graph Computation on Just a PC , 2012, OSDI.

[45] Nicolas Brunie,et al. Simultaneous branch and warp interweaving for sustained GPU performance , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[46] Jared Hoberock,et al. Edge v. Node Parallelism for Graph Centrality Metrics , 2012 .

[47] Matei Ripeanu,et al. A yoke of oxen and a thousand chickens for heavy lifting graph processing , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).

[48] Kunle Olukotun,et al. Green-Marl: a DSL for easy and efficient graph analysis , 2012, ASPLOS XVII.

[49] Joseph Gonzalez,et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs , 2012, OSDI.

[50] Willy Zwaenepoel,et al. X-Stream: edge-centric graph processing using streaming partitions , 2013, SOSP.

[51] Jianlong Zhong,et al. Towards GPU-Accelerated Large-Scale Graph Processing in the Cloud , 2013, 2013 IEEE 5th International Conference on Cloud Computing Technology and Science.

[52] Michela Becchi,et al. Deploying Graph Algorithms on GPUs: An Adaptive Solution , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.

[53] Keshav Pingali,et al. Morph algorithms on GPUs , 2013, PPoPP '13.

[54] Panos Kalnis,et al. Mizan: a system for dynamic load balancing in large-scale graph processing , 2013, EuroSys '13.

[55] Jinha Kim,et al. TurboGraph: a fast parallel graph engine handling billion-scale graphs in a single PC , 2013, KDD.

[56] Keval Vora,et al. CuSha: vertex-centric graph processing on GPUs , 2014, HPDC '14.

[57] Mihai Surdeanu,et al. The Stanford CoreNLP Natural Language Processing Toolkit , 2014, ACL.

[58] Bin Li,et al. Distributed cooperative localization based on Gaussian message passing on factor graph in wireless networks , 2015, Science China Information Sciences.

[59] Bingsheng He,et al. In-Cache Query Co-Processing on Coupled CPU-GPU Architectures , 2014, Proc. VLDB Endow..

[60] Shengen Yan,et al. yaSpMV: yet another SpMV framework on GPUs , 2014, PPoPP.

[61] Zhisong Fu,et al. MapGraph: A High Level API for Fast Development of High Performance Graph Analytics on GPUs , 2014, GRADES.

[62] Michael Garland,et al. Work-Efficient Parallel GPU Methods for Single-Source Shortest Paths , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.

[63] Bing Yang,et al. BiELL: A bisection ELLPACK-based storage format for optimizing SpMV on GPUs , 2014, J. Parallel Distributed Comput..

[64] Srinivasan Parthasarathy,et al. Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.

[65] Jianlong Zhong,et al. Medusa: Simplified Graph Processing on GPUs , 2014, IEEE Transactions on Parallel and Distributed Systems.

[66] Zhenguo Li,et al. VENUS: Vertex-centric streamlined graph computation on a single PC , 2015, 2015 IEEE 31st International Conference on Data Engineering.

[67] Sudipto Guha,et al. Vertex and Hyperedge Connectivity in Dynamic Graph Streams , 2015, PODS.

[68] Karsten Schwan,et al. GraphReduce: processing large-scale graphs on accelerator-based systems , 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.

[69] H. Howie Huang,et al. Enterprise: breadth-first graph traversal on GPUs , 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.

[70] Kenli Li,et al. Performance Analysis and Optimization for SpMV on GPU Using Probabilistic Modeling , 2015, IEEE Transactions on Parallel and Distributed Systems.

[71] Hai Jin,et al. Optimization of asynchronous graph processing on GPU with hybrid coloring model , 2015, PPoPP.

[72] John D. Owens,et al. Gunrock: a high-performance graph processing library on the GPU , 2015, PPoPP.

[73] Wenguang Chen,et al. GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning , 2015, USENIX ATC.

[74] Jinwook Kim,et al. GStream: a graph streaming processing method for large-scale graphs on GPUs , 2015, PPoPP.

[75] Michael I. Jordan,et al. Machine learning: Trends, perspectives, and prospects , 2015, Science.

[76] Alexandros G. Dimakis,et al. FrogWild! - Fast PageRank Approximations on Graph Engines , 2015, Proc. VLDB Endow..

[77] Ariful Azad,et al. A Parallel Tree Grafting Algorithm for Maximum Cardinality Matching in Bipartite Graphs , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.

[78] Scott McMillan,et al. GBTL-CUDA: Graph Algorithms and Primitives for GPUs , 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).

[79] H. Howie Huang,et al. iBFS: Concurrent Breadth-First Search on GPUs , 2016, SIGMOD Conference.

[80] Feng Shi,et al. Sparse Matrix Format Selection with Multiclass SVM for SpMV on GPU , 2016, 2016 45th International Conference on Parallel Processing (ICPP).

[81] Jinwook Kim,et al. GTS: A Fast and Scalable Graph Processing Method based on Streaming Topology to GPUs , 2016, SIGMOD Conference.

[82] Wenguang Chen,et al. FinePar: Irregularity-aware fine-grained workload partitioning on integrated architectures , 2017, 2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).

[83] Davide Barbieri,et al. Sparse Matrix-Vector Multiplication on GPGPUs , 2017, ACM Trans. Math. Softw..