Biological Network Querying Techniques: Analysis and Comparison

Research in systems biology has made available large amounts of data about interactions among the building blocks of the cell (e.g., proteins and genes). Automatic tools have become crucial for searching these data and mining useful information from them. Such tools rely on biological networks as a formal model for encoding molecular interactions. Biological networks can be given as input to graph-based techniques that infer new information about cellular activity and the evolutionary processes of species. In this context, network querying is a particularly interesting family of techniques. Network querying tools search a whole biological network for conserved occurrences of a given query module, with the goal of transferring biological knowledge. Because the query network generally encodes a well-characterized functional module, its occurrences in the queried network suggest that the queried network, and hence the corresponding organism, features the function encoded by the query. This paper analyzes and compares tools devised to query biological networks. The analysis is intended to clarify the problems and research issues, the state of the art, and the opportunities open to researchers working in this area.
