The Rosetta all-atom energy function for macromolecular modeling and design

Over the past decade, the Rosetta biomolecular modeling suite has informed diverse biological questions and engineering challenges ranging from interpretation of low-resolution structural data to design of nanomaterials, protein therapeutics, and vaccines. Central to Rosetta’s success is the energy function: a model parameterized from small-molecule and X-ray crystal structure data that is used to approximate the energy associated with each biomolecule conformation. This paper describes the mathematical models and physical concepts that underlie the latest Rosetta energy function, beta_nov15. Applying these concepts, we explain how to use Rosetta energies to identify and analyze the features of biomolecular models. Finally, we discuss the latest advances in the energy function that extend its capabilities from soluble proteins to membrane proteins, peptides containing non-canonical amino acids, carbohydrates, nucleic acids, and other macromolecules.
