A Parallel Algorithm for Counting Subgraphs in Complex Networks

Many natural and artificial structures can be represented as complex networks. Computing the frequency of all subgraphs of a certain size gives a comprehensive structural characterization of these networks. This is known as the subgraph census problem, and it is also an important intermediate step in computing other features of the network, such as network motifs. The subgraph census problem is computationally hard, and most existing algorithms for it are sequential. Here we present several increasingly efficient parallel strategies for solving it, culminating in a scalable and adaptive parallel algorithm. We applied our strategies to a representative set of biological networks and achieved almost linear speedups up to 128 processors, making it feasible to compute the census for bigger networks and larger subgraph sizes.
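To make the problem statement concrete, the following is a minimal brute-force sketch of a subgraph census for 3-node subgraphs in an undirected graph. It is not the algorithm presented in this paper; the function names (`census_rooted_at`, `census3`) and the example graph are assumptions chosen for illustration. Each connected triple is assigned to its smallest vertex, so the per-root counts are independent of one another and could, in principle, be computed by different processors and then summed.

```python
from collections import Counter
from itertools import combinations


def census_rooted_at(adj, root):
    """Count connected 3-node induced subgraphs whose smallest vertex is `root`.

    `adj` maps each vertex to the set of its neighbours (undirected graph).
    Illustrative helper, not taken from the paper.
    """
    counts = Counter()
    larger = [v for v in adj if v > root]
    for u, w in combinations(larger, 2):
        # Number of edges among the three vertices {root, u, w}.
        edges = (u in adj[root]) + (w in adj[root]) + (w in adj[u])
        if edges == 3:
            counts["triangle"] += 1
        elif edges == 2:
            # Two edges on three vertices always form a connected path.
            counts["path"] += 1
        # Fewer than two edges: the triple is disconnected and is skipped.
    return counts


def census3(adj):
    """Sum the independent per-root counts into the full census."""
    total = Counter()
    for root in adj:  # each iteration is an independent unit of work
        total += census_rooted_at(adj, root)
    return total


# Example graph (an assumption): a triangle 0-1-2 with a pendant vertex 3.
edge_list = [(0, 1), (1, 2), (2, 0), (2, 3)]
adj = {v: set() for v in range(4)}
for a, b in edge_list:
    adj[a].add(b)
    adj[b].add(a)

result = census3(adj)
print(dict(result))
```

The per-root units of work generally differ in cost, since high-degree vertices tend to yield many more subgraphs, which is one reason a static split by vertex may balance poorly and an adaptive strategy becomes attractive.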
