Static graph challenge: Subgraph isomorphism

The rise of graph analytic systems has created a need for ways to measure and compare the capabilities of these systems. Graph analytics present unique scalability difficulties. The machine learning, high performance computing, and visual analytics communities have wrestled with these difficulties for decades and developed methodologies for creating challenges to move these communities forward. The proposed Subgraph Isomorphism Graph Challenge draws upon prior challenges from machine learning, high performance computing, and visual analytics to create a graph challenge that is reflective of many real-world graph analytics processing systems. The Subgraph Isomorphism Graph Challenge is a holistic specification with multiple integrated kernels that can be run together or independently. Each kernel is well defined mathematically and can be implemented in any programming environment. Subgraph isomorphism is amenable to both vertex-centric implementations and array-based implementations (e.g., using the Graph-BLAS.org standard). The computations are simple enough that performance predictions can be made based on simple computing hardware models. The surrounding kernels provide the context for each kernel that allows rigorous definition of both the input and the output for each kernel. Furthermore, since the proposed graph challenge is scalable in both problem size and hardware, it can be used to measure and quantitatively compare a wide range of present day and future systems. Serial implementations in C++, Python, Python with Pandas, Matlab, Octave, and Julia have been implemented and their single threaded performance have been measured. Specifications, data, and software are publicly available at GraphChallenge.org.

[1]  Mauro Bisson,et al.  A CUDA implementation of the pagerank pipeline benchmark , 2016, 2016 IEEE High Performance Extreme Computing Conference (HPEC).

[2]  William Song,et al.  Streaming graph challenge: Stochastic block partition , 2017, 2017 IEEE High Performance Extreme Computing Conference (HPEC).

[3]  David A. Bader Designing Scalable Synthetic Compact Applications for Benchmarking High Productivity Computing Systems , 2006 .

[4]  Jonathan Cohen,et al.  Graph Twiddling in a MapReduce World , 2009, Computing in Science & Engineering.

[5]  Joseph P. Campbell,et al.  Testing with the YOHO CD-ROM voice verification corpus , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[6]  Jean Scholtz,et al.  A reflection on seven years of the VAST challenge , 2012, BELIV '12.

[7]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[8]  Tinkara Toš,et al.  Graph Algorithms in the Language of Linear Algebra , 2012, Software, environments, tools.

[9]  Jonathan W. Berry,et al.  A task-based linear algebra Building Blocks approach for scalable graph analytics , 2015, 2015 IEEE High Performance Extreme Computing Conference (HPEC).

[10]  John D. Owens,et al.  A Comparative Study on Exact Triangle Counting Algorithms on the GPU , 2016, HPGP@HPDC.

[11]  R. F. Boisvert,et al.  The Matrix Market Exchange Formats: Initial Design | NIST , 1996 .

[12]  Jeremy Kepner,et al.  Using a Power Law distribution to describe big data , 2015, 2015 IEEE High Performance Extreme Computing Conference (HPEC).

[13]  Jeremy Kepner,et al.  PageRank Pipeline Benchmark: Proposal for a Holistic System Benchmark for Big-Data Platforms , 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).

[14]  Stephen L. Olivier,et al.  Kokkos/Qthreads task-parallel approach to linear algebra based graph analytics , 2016, 2016 IEEE High Performance Extreme Computing Conference (HPEC).

[15]  Paul Burkhardt,et al.  Graphing trillions of triangles , 2016, Inf. Vis..

[16]  Jure Leskovec,et al.  {SNAP Datasets}: {Stanford} Large Network Dataset Collection , 2014 .

[17]  Christos Faloutsos,et al.  Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication , 2005, PKDD.

[18]  Georges G. Grinstein,et al.  The VAST Challenge: history, scope, and outcomes: An introduction to the Special Issue , 2014, Inf. Vis..

[19]  Jeremy Kepner,et al.  Graphulo: Linear Algebra Graph Kernels for NoSQL Databases , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop.

[20]  Tamara G. Kolda,et al.  Community structure and scale-free collections of Erdös-Rényi graphs , 2011, Physical review. E, Statistical, nonlinear, and soft matter physics.

[21]  Jia Wang,et al.  Truss Decomposition in Massive Networks , 2012, Proc. VLDB Endow..

[22]  Kun-Lung Wu,et al.  Counting and Sampling Triangles from a Graph Stream , 2013, Proc. VLDB Endow..

[23]  John R. Gilbert,et al.  Parallel Triangle Counting and Enumeration Using Matrix Algebra , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop.