Basic protein structure prediction for the biologist: A review

As the field of protein structure prediction continues to expand at an exponential rate, the bench biologist might feel overwhelmed by the sheer range of available applications. This review presents the three main approaches to computational structure prediction from a non-bioinformatician’s point of view and selects a set of freely available tools and servers. These tools are evaluated on several criteria, such as number of citations, ease of use and quality of the results. Finally, the applications of models generated by computational structure prediction are discussed.
