Protein Structure Modeling

Known protein sequences outnumber known protein structures by more than two orders of magnitude. Given this huge sequence-structure gap, most protein structures need to be predicted by computational methods rather than determined by experimental techniques. This chapter outlines various protein structure modeling approaches and associated resources.
