Protein Structure Evaluation using an All-Atom Energy Based Empirical Scoring Function

Abstract. Identifying the native conformation of a polypeptide chain, characterized by the lowest free energy, is a long-standing problem in protein structure prediction. Because rigorous free energy estimates are computationally demanding, scoring functions (energy based or statistical) have received considerable renewed attention in recent years as a means of distinguishing native protein structures from non-native-like structures. Several carefully designed decoy sets, CASP (Critical Assessment of Techniques for Protein Structure Prediction) structures, and homology-based, internet-accessible three-dimensional model builders are now available for validating scoring functions. We describe here an all-atom, energy-based empirical scoring function and examine its performance on a wide range of publicly available decoys. Barring two protein sequences for which the native structure is ranked second and seventh, the native structure is identified as the lowest-energy structure for 67 protein sequences from among 61,659 decoys belonging to 12 different decoy sets. We further illustrate a potential application of the scoring function in bracketing native-like structures of two small mixed alpha/beta globular proteins, starting from sequence and secondary structure information. The scoring function is available on the web at www.scfbio-iitd.res.in/utility/proteomics/energy.jsp
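
The ranking criterion used above (native ranked first, second, or seventh among its decoys) can be illustrated with a minimal sketch. The function name `native_rank` and all energy values below are hypothetical, chosen only for illustration; they are not taken from the paper, and the scoring function itself is not reproduced here.

```python
def native_rank(native_energy, decoy_energies):
    """Return the 1-based rank of the native structure when the native and
    all decoys are sorted by ascending energy (rank 1 = lowest energy)."""
    # Count how many decoys score strictly lower than the native structure.
    lower = sum(1 for e in decoy_energies if e < native_energy)
    return lower + 1


# Hypothetical energies in arbitrary units (assumed values, not from the paper).
decoys = [-310.5, -298.2, -305.0, -287.9]

print(native_rank(-321.4, decoys))  # native is the lowest-energy structure
print(native_rank(-306.0, decoys))  # one decoy scores lower than the native
```

A rank of 1 corresponds to the native structure being identified as the lowest-energy structure in its decoy set; any higher rank means at least one decoy scored lower.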
