Protein modeling: what happened to the "protein structure gap"?

Computational modeling of three-dimensional macromolecular structures and complexes from their sequence has been a long-standing vision in structural biology. Over the last two decades, a paradigm shift has occurred. Where there was once a large "structure knowledge gap" between the huge number of protein sequences and the small number of known structures, some form of structural information, either experimental structures or template-based models, is now available for the majority of amino acids encoded by the genomes of common model organisms. As scientific interest moves toward larger macromolecular complexes and dynamic networks of interactions, integrating computational modeling methods with low-resolution experimental techniques allows the study of large and complex molecular machines. One of the open challenges for computational modeling and prediction techniques is to convey the underlying assumptions, as well as the expected accuracy and structural variability of a specific model, all of which are crucial to understanding its limitations.
