Experimental Evaluation of Counting Subgraph Isomorphisms in Classes of Bounded Expansion

Counting subgraph isomorphisms (also called motifs or graphlets) has been used extensively as a tool for analyzing biological and social networks. Under standard complexity assumptions there is no polynomial-time algorithm for this problem, which limits the applicability of these tools to large data sets. Recent techniques from parameterized complexity have led to an algorithmic framework for isomorphism counting whose worst-case time complexity is linear in the number of vertices, provided that the input graph has certain structural characteristics known as bounded expansion. Previous work has suggested that the restrictions of bounded expansion structure (locally dense pockets in a globally sparse graph) naturally coincide with common properties of real-world networks such as clustering and heavy-tailed degree distributions. However, little work has been done on implementing and evaluating the performance of this algorithmic pipeline. To this end we introduce CONCUSS, an open-source software package for counting subgraph isomorphisms in classes of bounded expansion. Through a broad set of experiments we evaluate implementations of multiple stages of the pipeline and demonstrate that our structure-based algorithm can be up to an order of magnitude faster than a popular algorithm for isomorphism counting.
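To make the counted quantity concrete, the following is a minimal brute-force sketch in Python. It is not the CONCUSS pipeline and does not exploit bounded expansion; it simply enumerates every injective vertex map, so its running time grows exponentially with the pattern size. The function name `count_subgraph_isomorphisms` and the edge-list input format are assumptions made for illustration, as is the choice to count non-induced, edge-preserving injective maps (dividing by the number of pattern automorphisms would give the number of distinct copies instead).

```python
from itertools import permutations


def count_subgraph_isomorphisms(pattern_edges, host_edges):
    """Count injective maps from pattern vertices to host vertices
    that send every pattern edge to a host edge (non-induced).

    Hypothetical helper for illustration only. Vertices are inferred
    from the edge lists, so isolated vertices are ignored.
    """
    pattern_vertices = sorted({v for edge in pattern_edges for v in edge})
    host_vertices = sorted({v for edge in host_edges for v in edge})
    # Store host edges as unordered pairs for undirected lookup.
    host_edge_set = {frozenset(edge) for edge in host_edges}

    count = 0
    # Try every ordered choice of distinct host vertices as the image.
    for image in permutations(host_vertices, len(pattern_vertices)):
        mapping = dict(zip(pattern_vertices, image))
        if all(
            frozenset((mapping[u], mapping[v])) in host_edge_set
            for u, v in pattern_edges
        ):
            count += 1
    return count


# Pattern: a triangle. Host: the complete graph on four vertices.
triangle = [(0, 1), (1, 2), (0, 2)]
k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
triangle_maps_in_k4 = count_subgraph_isomorphisms(triangle, k4)
```

Here the complete graph on four vertices contains 4 distinct triangles and a triangle has 6 automorphisms, so `triangle_maps_in_k4` is 24. The exhaustive search over vertex tuples is exactly the cost that structure-based approaches aim to avoid on large sparse inputs.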
