Searching for repetitions in biological networks: methods, resources and tools

We present a compact overview of the data, models and methods proposed for analysing biological networks through the search for significant repetitions. We concentrate on three problems widely studied in the literature: 'network alignment', 'network querying' and 'network motif extraction'. We provide (i) details of the experimental techniques used to obtain the main types of interaction data, (ii) descriptions of the models and approaches introduced to solve these problems and (iii) pointers to the available databases and software tools. The intent is to lay out a roadmap for identifying suitable strategies to analyse cellular data, possibly through the joint use of different interaction data types or analysis techniques.
