Subgraph covers - An information theoretic approach to motif analysis in networks

Many real-world networks contain a statistically surprising number of certain subgraphs, called network motifs. In the prevalent approach to motif analysis, network motifs are detected by comparing subgraph frequencies in the original network with those expected under a statistical null model. In this paper, we propose an alternative approach in which network motifs are defined as the connectivity patterns that occur in a subgraph cover representing the network with minimal total information. A subgraph cover is a set of subgraphs such that every edge of the graph is contained in at least one subgraph in the cover. Some recently introduced random graph models that can incorporate significant densities of motifs have natural formulations in terms of subgraph covers, and the presented approach can be used to match networks with such models. To demonstrate the practical value of our approach, we also present a heuristic for the resulting NP-hard optimization problem and give results for several real-world networks.
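
As a small illustration of the definition of a subgraph cover, the following Python sketch checks whether a collection of subgraphs covers every edge of an undirected graph. It is not the heuristic of the paper and says nothing about the information content of a cover. The function names `edge` and `is_subgraph_cover`, the edge-list representation, and the toy graph are assumptions made for this example only.

```python
def edge(u, v):
    """Represent an undirected edge as an order-independent frozenset."""
    return frozenset((u, v))


def is_subgraph_cover(graph_edges, subgraphs):
    """Return True if `subgraphs` is a subgraph cover of the graph.

    graph_edges: iterable of (u, v) pairs, the edges of an undirected graph.
    subgraphs:   iterable of edge lists, one per candidate subgraph.
    The check follows the definition: every subgraph may only use edges of
    the graph, and every edge of the graph must lie in at least one subgraph.
    """
    graph = {edge(u, v) for u, v in graph_edges}
    covered = set()
    for sub in subgraphs:
        sub_edges = {edge(u, v) for u, v in sub}
        if not sub_edges <= graph:
            return False  # contains an edge that is not in the graph
        covered |= sub_edges
    return covered == graph


# Toy graph (hypothetical): a triangle a-b-c with a pendant edge c-d.
GRAPH = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
TRIANGLE = [("a", "b"), ("b", "c"), ("c", "a")]
PENDANT = [("c", "d")]

print(is_subgraph_cover(GRAPH, [TRIANGLE, PENDANT]))  # triangle plus single edge
print(is_subgraph_cover(GRAPH, [TRIANGLE]))           # edge c-d is left uncovered
```

In this toy case the graph admits several covers, for instance four single edges or one triangle plus one single edge. The approach described above would prefer the cover that represents the network with the least total information.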
