Combining HTM with RCU to Speed Up Graph Coloring on Multicore Platforms

Graph algorithms are hard to parallelize because they exhibit varying degrees of parallelism and perform irregular memory accesses. Graph coloring is a well-studied problem: assign a color to each vertex of a graph such that no two adjacent vertices share a color. Many applications require a coloring that uses few colors and is computed in near-linear time. In this work, we propose a simple and fast parallel graph coloring algorithm that is well suited to shared-memory architectures. Our algorithm employs Hardware Transactional Memory (HTM) to detect coloring inconsistencies between adjacent vertices and exploits Read-Copy-Update (RCU) to enable high performance and ensure correctness.
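To make the problem statement concrete, the following is a minimal sequential sketch of greedy (first-fit) coloring together with a check for the inconsistency the abstract mentions, namely two adjacent vertices holding the same color. It is an illustrative baseline only, not the HTM/RCU algorithm proposed in this work; the function names `greedy_coloring` and `is_proper` and the adjacency-dictionary input format are assumptions made for this example.

```python
def greedy_coloring(adj):
    """Sequential first-fit coloring (illustrative baseline, not the paper's method).

    adj: dict mapping each vertex to an iterable of its neighbors (assumed format).
    Returns a dict mapping each vertex to a non-negative integer color.
    """
    color = {}
    for v in adj:
        # Colors already taken by neighbors that have been colored so far.
        used = {color[u] for u in adj[v] if u in color}
        # Pick the smallest color not used by any colored neighbor.
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def is_proper(adj, color):
    """Return True if no edge joins two vertices of the same color."""
    return all(color[v] != color[u] for v in adj for u in adj[v])
```

Each vertex is visited once and each edge is inspected a constant number of times, so the running time is linear in the size of the graph. In a parallel setting, two adjacent vertices colored concurrently may both pick the same color, which is the kind of inconsistency a parallel algorithm has to detect and resolve.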
