Graph Coloring on the GPU

We design and implement parallel graph coloring algorithms on the GPU using two different abstractions: one data-centric (Gunrock), the other linear-algebra-based (GraphBLAS). We analyze how variations of a baseline independent-set algorithm affect coloring quality and runtime, and we study how optimizations such as hashing, avoiding atomics, and a max-min heuristic affect performance. On real-world datasets, our Gunrock graph coloring implementation achieves a peak speed-up of 2x and a geomean speed-up of 1.3x over previous hardwired state-of-the-art implementations, while producing 1.6x more colors. Our GraphBLAS implementation of Luby's algorithm produces 1.9x fewer colors than the previous state-of-the-art parallel implementation at the cost of 3x extra runtime, and 1.014x fewer colors than a greedy, sequential algorithm with a geomean speed-up of 2.6x.
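To make the baseline concrete, the following is a minimal sequential sketch of independent-set coloring in the style of Luby and Jones-Plassmann, written in Python rather than against the Gunrock or GraphBLAS APIs. It is not the implementation evaluated here. The function name `independent_set_coloring`, the adjacency-dictionary input, and the `use_min_max` flag are all assumptions made for illustration. In each round, every uncolored vertex whose random weight exceeds that of all its uncolored neighbors joins an independent set and receives the round's color. With the flag set, local minima receive a second color in the same round, which is one plausible reading of the max-min heuristic.

```python
import random


def independent_set_coloring(adj, seed=0, use_min_max=False):
    """Color an undirected graph given as {vertex: set_of_neighbors}.

    Illustrative sketch only. Assumes no self-loops and that vertex ids
    are mutually comparable (e.g. all ints).
    """
    rng = random.Random(seed)
    # Random priority per vertex; the vertex id breaks any ties.
    weight = {v: (rng.random(), v) for v in adj}
    color = {}
    next_color = 0
    uncolored = set(adj)
    while uncolored:
        # On a GPU each vertex would be examined by its own thread;
        # this loop stands in for that parallel step.
        maxima, minima = [], []
        for v in uncolored:
            rivals = [weight[u] for u in adj[v] if u in uncolored]
            if all(weight[v] > w for w in rivals):
                maxima.append(v)  # local maximum among uncolored neighbors
            elif use_min_max and all(weight[v] < w for w in rivals):
                minima.append(v)  # local minimum gets a second color
        for v in maxima:
            color[v] = next_color
        for v in minima:
            color[v] = next_color + 1
        # Two colors are reserved per round when min-max is on, so some
        # color ids may go unused if a round finds no local minima.
        next_color += 2 if use_min_max else 1
        uncolored.difference_update(maxima, minima)
    return color


if __name__ == "__main__":
    # A 5-cycle plus one isolated vertex.
    graph = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}, 5: set()}
    print(independent_set_coloring(graph, seed=1, use_min_max=True))
```

Every round colors at least the vertex with the globally largest remaining weight, so the loop terminates. Coloring two independent sets per round roughly halves the number of rounds and typically uses more colors, which matches the kind of quality-versus-runtime trade-off described above.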
