A Fast Counting Method for 6-motifs with Low Connectivity

A $k$-motif (or graphlet) is a subgraph on $k$ nodes in a graph or network. Counting of motifs in complex networks has been a well-studied problem in network analysis of various real-word graphs arising from the study of social networks and bioinformatics. In particular, the triangle counting problem has received much attention due to its significance in understanding the behavior of social networks. Similarly, subgraphs with more than 3 nodes have received much attention recently. While there have been successful methods developed on this problem, most of the existing algorithms are not scalable to large networks with millions of nodes and edges. The main contribution of this paper is a preliminary study that genaralizes the exact counting algorithm provided by Pinar, Seshadhri and Vishal to a collection of 6-motifs. This method uses the counts of motifs with smaller size to obtain the counts of 6-motifs with low connecivity, that is, containing a cut-vertex or a cut-edge. Therefore, it circumvents the combinatorial explosion that naturally arises when counting subgraphs in large networks.

[1]  Mohammad Al Hasan,et al.  Graft: An Efficient Graphlet Counting Method for Large Graph Analysis , 2014, IEEE Transactions on Knowledge and Data Engineering.

[2]  Sebastian Wernicke,et al.  FANMOD: a tool for fast network motif detection , 2006, Bioinform..

[3]  Sahar Asadi,et al.  Kavosh: a new algorithm for finding network motifs , 2009, BMC Bioinformatics.

[4]  Kurt Mehlhorn,et al.  Efficient graphlet kernels for large graph comparison , 2009, AISTATS.

[5]  Jure Leskovec,et al.  Higher-order organization of complex networks , 2016, Science.

[6]  Janez Demsar,et al.  A combinatorial approach to graphlet counting , 2014, Bioinform..

[7]  Jon M. Kleinberg,et al.  Subgraph frequencies: mapping the empirical and extremal geography of large graph collections , 2013, WWW.

[8]  Katherine Faust,et al.  A puzzle concerning triads in social networks: Graph constraints and the triad census , 2010, Soc. Networks.

[9]  S. Shen-Orr,et al.  Network motifs: simple building blocks of complex networks. , 2002, Science.

[10]  Mohammad Al Hasan,et al.  GUISE: Uniform Sampling of Graphlets for Large Graph Analysis , 2012, 2012 IEEE 12th International Conference on Data Mining.

[11]  Ümit V. Çatalyürek,et al.  Finding the Hierarchy of Dense Subgraphs using Nucleus Decompositions , 2014, WWW.

[12]  Alexandros G. Dimakis,et al.  Distributed Estimation of Graph 4-Profiles , 2016, WWW.

[13]  Süleyman Cenk Sahinalp,et al.  Not All Scale-Free Networks Are Born Equal: The Role of the Seed Graph in PPI Network Evolution , 2006, Systems Biology and Computational Proteomics.

[14]  Uri Alon,et al.  Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs , 2004, Bioinform..

[15]  Ryan A. Rossi,et al.  Efficient Graphlet Counting for Large Networks , 2015, 2015 IEEE International Conference on Data Mining.

[16]  Ove Frank,et al.  Triad count statistics , 1988, Discret. Math..

[17]  P. Holland,et al.  Local Structure in Social Networks , 1976 .

[18]  Ali Pinar,et al.  Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts , 2014, WWW.

[19]  Stefano Leucci,et al.  Motivo: Fast Motif Counting via Succinct Color Coding and Adaptive Sampling , 2019, Proc. VLDB Endow..

[20]  Mihail N. Kolountzakis,et al.  Triangle Sparsifiers , 2011, J. Graph Algorithms Appl..

[21]  Jakub W. Pachocki,et al.  Scalable Motif-aware Graph Clustering , 2016, WWW.

[22]  Madhav V. Marathe,et al.  SAHAD: Subgraph Analysis in Massive Networks Using Hadoop , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.

[23]  Ali Pinar,et al.  ESCAPE: Efficiently Counting All 5-Vertex Subgraphs , 2016, WWW.

[24]  Duncan J. Watts,et al.  Collective dynamics of ‘small-world’ networks , 1998, Nature.

[25]  Sebastian Wernicke,et al.  Efficient Detection of Network Motifs , 2006, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[26]  Alexandros G. Dimakis,et al.  Beyond Triangles: A Distributed Framework for Estimating 3-profiles of Large Graphs , 2015, KDD.

[27]  Charalampos E. Tsourakakis The K-clique Densest Subgraph Problem , 2015, WWW.

[28]  Christian Komusiewicz,et al.  Parameterized Algorithmics for Finding Connected Motifs in Biological Networks , 2011, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[29]  Yuval Shavitt,et al.  Approximating the Number of Network Motifs , 2009, Internet Math..

[30]  Tamara G. Kolda,et al.  Fast Triangle Counting through Wedge Sampling , 2012, ArXiv.