Data Mining Techniques for the Life Sciences

This chapter gives an overview of the most commonly used biological databases of nucleic acid sequences and their structures. We cover general sequence databases, databases for specific DNA features, noncoding RNA sequences, and RNA secondary and tertiary structures.
