A review on models and algorithms for motif discovery in protein-protein interaction networks.

Several algorithms have recently been designed to identify motifs in biological networks, particularly in protein-protein interaction networks. Motifs correspond to repeated modules in the network that may be of biological interest. The approaches proposed in the literature often differ in how a motif is defined, how its occurrences are counted, and how their statistical significance is assessed. These choices have strong implications for the computational complexity of the discovery process and for the type of results that can be expected. This review systematically presents the different computational settings, outlining their main features and limitations.
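
To make the three ingredients above concrete (a motif definition, a counting rule, and a significance test), the following is a minimal sketch and not a method taken from any of the reviewed tools. It assumes the motif is a triangle, that occurrences are counted as distinct node triples in an undirected simple graph, and that significance is a z-score against a null model built by degree-preserving double edge swaps. All function names and the toy network are hypothetical.

```python
import random
from statistics import mean, pstdev


def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs.

    Assumes no self-loops and no duplicate edges.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is found once from each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Attempt n_swaps double edge swaps; every node keeps its degree."""
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick one of the two possible rewirings
        if a == d or c == b:
            continue  # would create a self-loop
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if new1 == new2 or new1 in edge_set or new2 in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_list


def motif_z_score(edges, n_random=200, seed=0):
    """Return (observed count, null mean, z-score) for the triangle motif."""
    rng = random.Random(seed)
    observed = count_triangles(edges)
    null_counts = [
        count_triangles(degree_preserving_shuffle(edges, 10 * len(edges), rng))
        for _ in range(n_random)
    ]
    mu, sigma = mean(null_counts), pstdev(null_counts)
    z = (observed - mu) / sigma if sigma > 0 else float("nan")
    return observed, mu, z


if __name__ == "__main__":
    # Hypothetical toy network, not real interaction data.
    toy = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4), (4, 5), (5, 6)]
    print(motif_z_score(toy))
```

Changing any one of the three assumptions (for example, counting only edge-disjoint occurrences, or using a different random graph model as the null) can change both the running time and which subgraphs are reported as motifs.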
