Pedestrian guide to analyzing sequence databases

[1]  D. Lipman,et al.  Improved tools for biological sequence comparison. , 1988, Proceedings of the National Academy of Sciences of the United States of America.

[2]  K Fidelis,et al.  A large‐scale experiment to assess protein structure prediction methods , 1995, Proteins.

[3]  S. Knudsen,et al.  Prediction of human mRNA donor and acceptor sites from the DNA sequence. , 1991, Journal of molecular biology.

[4]  T L Blundell,et al.  Use of amino acid environment-dependent substitution tables and conformational propensities in structure prediction from aligned sequences of homologous proteins. II. Secondary structures. , 1994, Journal of molecular biology.

[5]  N. Colloc'h,et al.  Comparison of three algorithms for the assignment of secondary structure in proteins: the advantages of a consensus assignment. , 1993, Protein engineering.

[6]  B Shomer,et al.  Information services of the European Bioinformatics Institute. , 1996, Methods in enzymology.

[7]  John P. Overington,et al.  Environment‐specific amino acid substitution tables: Tertiary templates and prediction of protein folds , 1992, Protein science : a publication of the Protein Society.

[8]  Burkhard Rost,et al.  Refining Neural Network Predictions for Helical Transmembrane Proteins by Dynamic Programming , 1996, ISMB.

[9]  Geoffrey J. Barton,et al.  Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation , 1993, Comput. Appl. Biosci..

[10]  D. Shortle Protein fold recognition , 1995, Nature Structural Biology.

[11]  M. Gribskov,et al.  Profile analysis , 1990 .

[12]  Chris Sander,et al.  The HSSP database of protein structure-sequence alignments , 1993, Nucleic Acids Res..

[13]  W. Kabsch,et al.  Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features , 1983, Biopolymers.

[14]  M J Sternberg,et al.  A simple method to generate non-trivial alternate alignments of protein sequences. , 1991, Journal of molecular biology.

[15]  Thomas L. Madden,et al.  Applications of network BLAST server. , 1996, Methods in enzymology.

[16]  C. Chothia,et al.  Understanding protein structure: using scop for fold interpretation. , 1996, Methods in enzymology.

[17]  Rolf Apweiler,et al.  The SWISS-PROT protein sequence data bank and its new supplement TREMBL , 1996, Nucleic Acids Res..

[18]  M. Zuker Suboptimal sequence alignment in molecular biology. Alignment with error analysis. , 1991, Journal of molecular biology.

[19]  A. Lesk,et al.  The relation between the divergence of sequence and structure in proteins. , 1986, The EMBO journal.

[20]  G J Barton,et al.  Protein secondary structure prediction. , 1995, Current opinion in structural biology.

[21]  R Abagyan,et al.  Homology modeling by the ICM method , 1995, Proteins.

[22]  B. Dujon The yeast genome project: what did we learn? , 1996, Trends in genetics : TIG.

[23]  R. Fleischmann,et al.  Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. , 1995, Science.

[24]  Patrick Argos,et al.  Prediction of protein structure , 1986 .

[25]  M J Sippl,et al.  Progress in fold recognition , 1995, Proteins.

[26]  P Argos,et al.  A method to configure protein side-chains from the main-chain trace in homology modelling. , 1993, Journal of molecular biology.

[27]  R F Doolittle,et al.  Convergent evolution: the need to be explicit. , 1994, Trends in biochemical sciences.

[28]  Sean R. Eddy,et al.  Multiple Alignment Using Hidden Markov Models , 1995, ISMB.

[29]  R. Abagyan,et al.  Biased probability Monte Carlo conformational searches and electrostatic calculations for peptides and proteins. , 1994, Journal of molecular biology.

[30]  Philipp Bucher,et al.  A Sequence Similarity Search Algorithm Based on a Probabilistic Interpretation of an Alignment Scoring System , 1996, ISMB.

[31]  C Sander,et al.  Structure prediction of proteins--where are we now? , 1994, Current opinion in biotechnology.

[32]  Eugene V. Koonin,et al.  Protein sequence comparison at genome scale , 1996 .

[33]  E. Neher How frequent are correlated changes in families of protein sequences? , 1994, Proceedings of the National Academy of Sciences of the United States of America.

[34]  B Rost,et al.  Bridging the protein sequence-structure gap by structure predictions. , 1996, Annual review of biophysics and biomolecular structure.

[35]  S. Wodak,et al.  Protein structure prediction by threading methods: Evaluation of current techniques , 1995, Proteins.

[36]  W. Pearson Effective protein sequence comparison. , 1996, Methods in enzymology.

[37]  W. Taylor,et al.  The classification of amino acid conservation. , 1986, Journal of theoretical biology.

[38]  B. Rost,et al.  Data Based Modeling of Proteins , 1994 .

[39]  K. Hatrick,et al.  Compensating changes in protein multiple sequence alignments. , 1994, Protein engineering.

[40]  Terry Gaasterland,et al.  Reconstruction of Metabolic Networks Using Incomplete Information , 1995, ISMB.

[41]  P. Kollman,et al.  Molecular mechanical potential functions and their application to study molecular systems , 1991, Current Opinion in Structural Biology.

[42]  T. Blundell,et al.  Comparative protein modelling by satisfaction of spatial restraints. , 1993, Journal of molecular biology.

[43]  G. Barton,et al.  Multiple protein sequence alignment from tertiary structure comparison: Assignment of global and residue confidence levels , 1992, Proteins.

[44]  S A Benner,et al.  Bona fide prediction of aspects of protein conformation. Assigning interior and surface residues from patterns of variation and conservation in homologous protein sequences. , 1994, Journal of molecular biology.

[45]  S H Kim,et al.  Predicting surface exposure of amino acids from protein sequence. , 1990, Protein engineering.

[46]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[47]  B Rost,et al.  Pitfalls of protein sequence analysis. , 1996, Current opinion in biotechnology.

[48]  M. Levitt Accurate modeling of protein conformation by automatic segment matching. , 1992, Journal of molecular biology.

[49]  Burkhard Rost,et al.  TOPITS: Threading One-Dimensional Predictions Into Three-Dimensional Structures , 1995, ISMB.

[50]  A. D. McLachlan,et al.  Sequence comparison by exponentially-damped alignment , 1984, Nucleic Acids Res..

[51]  G. Schuler,et al.  Entrez: molecular biology database and retrieval system. , 1996, Methods in enzymology.

[52]  G J Williams,et al.  The Protein Data Bank: a computer-based archival file for macromolecular structures. , 1977, Journal of molecular biology.

[53]  R. Doolittle Of URFs and ORFs: a primer on how to analyze derived amino acid sequences , 1986 .

[54]  E E Lattman,et al.  Protein crystallography for all , 1994, Proteins.

[55]  D. T. Jones,et al.  A new approach to protein fold recognition , 1992, Nature.

[56]  P Argos,et al.  Prediction of transmembrane segments in proteins utilising multiple sequence alignments. , 1994, Journal of molecular biology.

[57]  D. Higgins,et al.  CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. , 1988, Gene.

[58]  J. Wootton,et al.  Analysis of compositionally biased regions in sequence databases. , 1996, Methods in enzymology.

[59]  Reinhard Schneider Sequenz und Sequenz-Struktur: Vergleiche und deren Anwendung für die Struktur- und Funktionsvorhersage von Proteinen , 1994 .

[60]  M. Sternberg,et al.  A strategy for the rapid multiple alignment of protein sequences. Confidence levels from tertiary structure comparisons. , 1987, Journal of molecular biology.

[61]  B. Rost PHD: predicting one-dimensional protein structure by profile-based neural networks. , 1996, Methods in enzymology.

[62]  M. Delarue,et al.  Converting sequence block alignments into structural insights. , 1996, Methods in enzymology.

[63]  S. Henikoff,et al.  Position-based sequence weights. , 1994, Journal of molecular biology.

[64]  B. Rost,et al.  Transmembrane helices predicted at 95% accuracy , 1995, Protein science : a publication of the Protein Society.

[65]  S. Henikoff,et al.  Blocks database and its applications. , 1996, Methods in enzymology.

[66]  M. Johnston,et al.  Towards a complete understanding of how a simple eukaryotic cell works. , 1996, Trends in genetics : TIG.

[67]  E. Uberbacher,et al.  Discovering and understanding genes in human DNA sequence using GRAIL. , 1996, Methods in enzymology.

[68]  B. Rost,et al.  Prediction of protein secondary structure at better than 70% accuracy. , 1993, Journal of molecular biology.

[69]  J. Risler,et al.  Amino acid substitutions in structurally related proteins. A pattern recognition approach. Determination of a new and efficient scoring matrix. , 1988, Journal of molecular biology.

[70]  S F Altschul,et al.  Local alignment statistics. , 1996, Methods in enzymology.

[71]  J. Thompson,et al.  CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. , 1994, Nucleic acids research.

[72]  C. Sander,et al.  Quality control of protein models : directional atomic contact analysis , 1993 .

[73]  K Nishikawa,et al.  The amino acid composition is different between the cytoplasmic and extracellular sides in membrane proteins , 1992, FEBS letters.

[74]  Peter D. Karp,et al.  HinCyc: A Knowledge Base of the Complete Genome and Metabolic Pathways of H. influenzae , 1996, ISMB.

[75]  R. F. Smith,et al.  Pattern-induced multi-sequence alignment (PIMA) algorithm employing secondary structure-dependent gap penalties for use in comparative protein modelling. , 1992 .

[76]  S. Henikoff,et al.  Amino acid substitution matrices from protein blocks. , 1992, Proceedings of the National Academy of Sciences of the United States of America.

[77]  Martin Vingron,et al.  A fast and sensitive multiple sequence alignment algorithm , 1989, Comput. Appl. Biosci..

[78]  M. Gribskov,et al.  Identification of sequence patterns with profile analysis , 1996 .

[79]  A Tsugita,et al.  The PIR-International Protein Sequence Database. , 1996, Nucleic acids research.

[80]  C Sander,et al.  Progress in protein structure prediction? , 1993, Trends in biochemical sciences.

[81]  C. Chothia,et al.  Structural patterns in globular proteins , 1976, Nature.

[82]  W R Taylor,et al.  Multiple protein sequence alignment: algorithms and gap insertion. , 1996, Methods in enzymology.

[83]  W. C. Barker Of URFs and ORFs: A primer on how to analyze derived amino acid sequences: Russell F. Doolittle, University Science Books, Mill Valley, CA , 1987 .

[84]  G. Barton,et al.  The limits of protein secondary structure prediction accuracy from multiple sequence alignment. , 1993, Journal of molecular biology.

[85]  Jun S. Liu,et al.  Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. , 1993, Science.

[86]  S Henikoff,et al.  Performance evaluation of amino acid substitution matrices , 1993, Proteins.

[87]  B. Rost,et al.  Combining evolutionary information and neural networks to predict protein secondary structure , 1994, Proteins.

[88]  T L Blundell,et al.  Use of amino acid environment-dependent substitution tables and conformational propensities in structure prediction from aligned sequences of homologous proteins. I. Solvent accessibility classes. , 1994, Journal of molecular biology.

[89]  M. O. Dayhoff,et al.  Atlas of protein sequence and structure , 1965 .

[90]  G Vriend,et al.  Predicting local structural changes that result from point mutations. , 1994, Protein engineering.

[91]  M Levitt,et al.  Alignment of the amino acid sequences of distantly related proteins using variable gap penalties. , 1986, Protein engineering.

[92]  C Sander,et al.  The use of position‐specific rotamers in model building by homology , 1995, Proteins.

[93]  G. Gonnet,et al.  Exhaustive matching of the entire protein sequence database. , 1992, Science.

[94]  M Karplus,et al.  Modeling of globular proteins. A distance-based data search procedure for the construction of insertion/deletion regions and Pro----non-Pro mutations. , 1990, Journal of molecular biology.

[95]  O. Lund,et al.  Prediction of O-glycosylation of mammalian proteins: specificity patterns of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase. , 1995, The Biochemical journal.

[96]  S. Oliver A network approach to the systematic analysis of yeast gene function. , 1996, Trends in genetics : TIG.

[97]  S. B. Needleman,et al.  A general method applicable to the search for similarities in the amino acid sequence of two proteins. , 1970, Journal of molecular biology.

[98]  M J Sippl,et al.  Knowledge-based potentials for proteins. , 1995, Current opinion in structural biology.

[99]  C. Sander,et al.  Can three-dimensional contacts in protein structures be predicted by analysis of correlated mutations? , 1994, Protein engineering.

[100]  Chris Sander,et al.  The FSSP database: fold classification based on structure-structure alignment of proteins , 1996, Nucleic Acids Res..

[101]  Chris Sander,et al.  Jury returns on structure prediction , 1992, Nature.

[102]  M S Waterman,et al.  Sequence alignment and penalty choice. Review of concepts, case studies and implications. , 1994, Journal of molecular biology.

[103]  R. Doolittle Computer methods for macromolecular sequence analysis , 1996 .

[104]  T. Hubbard,et al.  Fold recognition and ab initio structure predictions using hidden markov models and β‐strand pair potentials , 1995, Proteins.

[105]  A A Salamov,et al.  Prediction of protein secondary structure by combining nearest-neighbor algorithms and multiple sequence alignments. , 1995, Journal of molecular biology.

[106]  B Rost,et al.  Progress of 1D protein structure prediction at last , 1995, Proteins.

[107]  J. Gibrat,et al.  GOR method for predicting protein secondary structure from amino acid sequence. , 1996, Methods in enzymology.

[108]  B. Rost,et al.  Conservation and prediction of solvent accessibility in protein families , 1994, Proteins.

[109]  D. Haussler,et al.  Hidden Markov models in computational biology. Applications to protein modeling. , 1993, Journal of molecular biology.

[110]  T. P. Flores,et al.  Identification and classification of protein fold families. , 1993, Protein engineering.

[111]  J. Thompson,et al.  Using CLUSTAL for multiple sequence alignments. , 1996, Methods in enzymology.

[112]  M. Murata,et al.  Three-way Needleman--Wunsch algorithm. , 1990, Methods in enzymology.

[113]  S. Bryant,et al.  Statistics of sequence-structure threading. , 1995, Current opinion in structural biology.

[114]  Amos Bairoch,et al.  The PROSITE database, its status in 1995 , 1996, Nucleic Acids Res..

[115]  Robert B. Russell,et al.  Protein structure prediction , 1993, Nature.

[116]  B. Rost,et al.  Protein fold recognition by prediction-based threading. , 1997, Journal of molecular biology.

[117]  T L Blundell,et al.  Automated comparative modelling of protein structures. , 1994, Current opinion in biotechnology.

[118]  G. Barton,et al.  Protein fold recognition by mapping predicted secondary structures. , 1996, Journal of molecular biology.

[119]  P. Argos,et al.  Determination of reliable regions in protein sequence alignments. , 1990, Protein engineering.

[120]  C. Sander,et al.  Correlated mutations and residue contacts in proteins , 1994, Proteins.

[121]  Peter H. Sellers,et al.  An Algorithm for the Distance Between Two Finite Sequences , 1974, J. Comb. Theory, Ser. A.

[122]  S Brunak,et al.  Analysis of eukaryotic promoter sequences reveals a systematically occurring CT-signal. , 1995, Nucleic acids research.

[123]  T. Gibson,et al.  Applying motif and profile searches. , 1996, Methods in enzymology.

[124]  S Henikoff,et al.  Sequence analysis by electronic mail server. , 1993, Trends in biochemical sciences.

[125]  Shoshana J. Wodak,et al.  Generating and testing protein folds , 1993 .

[126]  A Tramontano,et al.  Update on protein structure prediction: results of the 1995 IRBM workshop. , 1996, Folding & design.

[127]  P. Argos,et al.  Protein structure prediction: recognition of primary, secondary, and tertiary structural features from amino acid sequence. , 1995, Critical reviews in biochemistry and molecular biology.

[128]  M. S. Johnson,et al.  Residue–Residue contact substitution probabilities derived from aligned three‐dimensional structures and the identification of common folds , 1994, Protein science : a publication of the Protein Society.

[129]  B. Rost,et al.  Topology prediction for helical transmembrane proteins at 86% accuracy , 1996, Protein science : a publication of the Protein Society.

[130]  A M Lesk,et al.  Comparison of the structures of globins and phycocyanins: Evidence for evolutionary relationship , 1990, Proteins.

[131]  G. von Heijne,et al.  Predicting the topology of eukaryotic membrane proteins. , 1993, European journal of biochemistry.

[132]  C Sander,et al.  Bioinformatics and the discovery of gene function. , 1996, Trends in genetics : TIG.

[133]  G. Heijne Membrane protein structure prediction. Hydrophobicity analysis and the positive-inside rule. , 1992, Journal of molecular biology.

[134]  C Sander,et al.  On the use of sequence homologies to predict protein structure: identical pentapeptides can have completely different conformations. , 1984, Proceedings of the National Academy of Sciences of the United States of America.

[135]  S. Knudsen,et al.  G+C-rich tract in 5' end of human introns. , 1992, Journal of molecular biology.

[136]  A. Mclachlan,et al.  Repeating sequences and gene duplication in proteins. , 1972, Journal of molecular biology.

[137]  William R. Taylor,et al.  Multiple sequence alignment by a pairwise algorithm , 1987, Comput. Appl. Biosci..

[138]  P. Argos,et al.  SRS: information retrieval system for molecular biology data banks. , 1996, Methods in enzymology.

[139]  Ernest Feytmans,et al.  MATCH-BOX: a fundamentally new algorithm for the simultaneous alignment of several protein sequences , 1992, Comput. Appl. Biosci..

[140]  B. Rost,et al.  Redefining the goals of protein secondary structure prediction. , 1994, Journal of molecular biology.

[141]  D. Higgins,et al.  SAGA: sequence alignment by genetic algorithm. , 1996, Nucleic acids research.

[142]  M. A. McClure,et al.  Hidden Markov models of biological primary sequence information. , 1994, Proceedings of the National Academy of Sciences of the United States of America.

[143]  David Sankoff,et al.  Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison , 1983 .

[144]  J. Garnier,et al.  Improving protein secondary structure prediction with aligned homologous sequences , 1996, Protein science : a publication of the Protein Society.