Bioinformatics: principles and potential of a new multidisciplinary tool.

The materials of bioinformatics are biological data, and its methods are derived from a wide variety of computational techniques. Recent years have seen explosive growth in biological data and the development of novel computational methods. These methods have become essential to research progress in structural biology, genomics, structure-based drug design and molecular evolution. Developing and maintaining a robust infrastructure for biological data is equally important if biotechnology is to take maximum advantage of research advances in a wide variety of fields. While bioinformatics has already made important contributions, it faces significant challenges as it matures.
