Parallel Triangle Counting and Enumeration Using Matrix Algebra

Triangle counting and enumeration are important kernels for characterizing graphs and for computing statistics such as clustering coefficients. We provide a simple exact algorithm based on operations on sparse adjacency matrices. By parallelizing the individual sparse matrix operations, we obtain a parallel algorithm for triangle counting. The algorithm generalizes to triangle enumeration by modifying the semiring that underlies the matrix algebra. We present a new primitive, masked matrix multiplication, that is especially beneficial for the enumeration case. We provide results from an initial implementation of the counting case, along with various optimizations for communication reduction and load balance.
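
The idea of counting triangles through a masked sparse product can be illustrated with a small serial sketch. It assumes one common formulation that the abstract itself does not spell out: split the adjacency matrix A into its strictly lower and strictly upper triangular parts L and U, compute B = L·U only at positions where A is nonzero (the mask), and halve the sum of the result. All function names below are hypothetical, vertices are assumed to be integers, and plain Python sets stand in for sparse rows; this is not the parallel implementation the abstract refers to.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Return a symmetric adjacency map {vertex: set(neighbors)}, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def masked_lower_upper_product(adj):
    """Sketch of B = L * U evaluated only where A is nonzero.

    Row i of L and column j of U are both "neighbors with a smaller index",
    so entry (i, j) is built from the common lower-numbered neighbors of i and j.
    Instead of adding ones, each entry collects the contributing k values,
    which mimics swapping the semiring to support enumeration.
    """
    lower = {i: {k for k in nbrs if k < i} for i, nbrs in adj.items()}
    product = {}
    for i, nbrs in adj.items():
        for j in nbrs:  # the mask: only positions (i, j) that are edges
            common = lower[i] & lower[j]
            if common:
                product[(i, j)] = sorted(common)
    return product


def count_triangles(edges):
    """Each triangle appears at (i, j) and (j, i), so the total is halved."""
    product = masked_lower_upper_product(build_adjacency(edges))
    return sum(len(ks) for ks in product.values()) // 2


def enumerate_triangles(edges):
    """List each triangle once as (k, i, j) with k < i < j."""
    product = masked_lower_upper_product(build_adjacency(edges))
    return sorted((k, i, j) for (i, j), ks in product.items() if i < j for k in ks)


if __name__ == "__main__":
    example_edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]
    print(count_triangles(example_edges))
    print(enumerate_triangles(example_edges))
```

Because the mask restricts work to existing edges, wedges that cannot close into a triangle are never stored; in the enumeration variant this avoids materializing lists for entries that would be discarded anyway.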
