Distributions of experimental protein structures on coarse-grained free energy landscapes.

Predicting the conformational changes of proteins is essential for fully understanding their functional mechanisms. With the large number of structures now available for sets of related proteins, it is possible to directly visualize clusters of conformations and the transitions between them using principal component analysis. The most striking feature of the structures' distributions along the principal components is that they are highly non-uniform. In this work, we use principal component analysis of experimental structures of 50 diverse proteins to extract the most important directions of their motions, sample structures along these directions, and estimate their free energy landscapes by combining knowledge-based potentials with entropies computed from elastic network models. When the resulting motions are visualized on their coarse-grained free energy landscapes, the basis for conformational pathways becomes readily apparent. Using three well-studied proteins, T4 lysozyme, serum albumin, and sarco-endoplasmic reticular Ca(2+) adenosine triphosphatase (SERCA), as examples, we show that such free energy landscapes of conformational changes provide meaningful insights into functional dynamics and suggest transition pathways between different conformational states. As a further example, we show that Monte Carlo simulations on the coarse-grained landscape of HIV-1 protease can directly yield pathways for force-driven conformational changes.
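The workflow summarized above (principal component analysis of a structural ensemble, sampling along the dominant directions, and Monte Carlo moves on a gridded landscape) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the structures have already been superimposed and share the same atoms, and it takes the free energy grid `F` as a given array rather than computing it from knowledge-based potentials and elastic network entropies. All function names (`pca_of_ensemble`, `sample_along_pc`, `metropolis_walk`) and the parameter `kT` are hypothetical.

```python
import numpy as np


def pca_of_ensemble(coords):
    """PCA of an ensemble of pre-superimposed structures.

    coords: array of shape (n_structures, n_atoms, 3).
    Returns the mean structure (flattened), the principal components
    (rows, unit length), their variances, and the projections of each
    structure onto the components.
    """
    n = coords.shape[0]
    X = coords.reshape(n, -1)              # one row per structure
    mean = X.mean(axis=0)
    Xc = X - mean                          # center the ensemble
    # SVD of the centered data gives the eigenvectors of the covariance
    # matrix (rows of Vt) without forming the covariance explicitly.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s ** 2 / (n - 1)
    scores = Xc @ Vt.T
    return mean, Vt, variances, scores


def sample_along_pc(mean, pc, amplitudes, n_atoms):
    """Generate structures displaced from the mean along one component."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    flat = mean + np.outer(amplitudes, pc)
    return flat.reshape(len(amplitudes), n_atoms, 3)


def metropolis_walk(F, start, n_steps, kT=1.0, seed=0):
    """Metropolis random walk on a 2D grid of free energies F[i, j].

    Trial moves go to one of the four nearest grid points. Moves that
    leave the grid are rejected; downhill moves are always accepted and
    uphill moves are accepted with probability exp(-dF / kT).
    """
    rng = np.random.default_rng(seed)
    moves = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])
    pos = np.array(start, dtype=int)
    path = [(int(pos[0]), int(pos[1]))]
    for _ in range(n_steps):
        trial = pos + moves[rng.integers(4)]
        inside = np.all(trial >= 0) and np.all(trial < np.array(F.shape))
        if inside:
            dF = F[trial[0], trial[1]] - F[pos[0], pos[1]]
            if dF <= 0 or rng.random() < np.exp(-dF / kT):
                pos = trial
        path.append((int(pos[0]), int(pos[1])))
    return path
```

In such a sketch, the grid axes would correspond to amplitudes along the first two principal components, so a walk returned by `metropolis_walk` can be mapped back to structures with `sample_along_pc`. An external force could be mimicked by adding a linear bias to `F` before the walk; that choice is also an assumption made for illustration.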
