Complexity of free energy landscapes of peptides revealed by nonlinear principal component analysis

Employing the recently developed hierarchical nonlinear principal component analysis (NLPCA) method of Saegusa et al. (Neurocomputing 2004;61:57–70 and IEICE Trans Inf Syst 2005;E88‐D:2242–2248), we studied the complexity of the free energy landscapes of several peptides, including triglycine, hexaalanine, and the C‐terminal β‐hairpin of protein G. First, the performance of this NLPCA method was compared with that of standard linear principal component analysis (PCA). In particular, we compared the two methods according to (1) their ability to reduce dimensionality and (2) how efficiently they represent peptide conformations in the low‐dimensional spaces spanned by the first few principal components. The study revealed that NLPCA reduces the dimensionality of the considered systems much better than PCA does. For example, to obtain a similar error in representing the original β‐hairpin data in a low‐dimensional space, one needs 4 principal components with NLPCA but 21 with PCA. Second, representing the free energy landscapes of the considered systems as a function of the first two principal components obtained from PCA yielded relatively well‐structured landscapes. In contrast, the free energy landscapes obtained from NLPCA are much more complicated, exhibiting many states that are hidden in the PCA maps, especially in the unfolded regions. Furthermore, the study showed that many states in the PCA maps are mixtures of several peptide conformations, whereas those in the NLPCA maps are purer. This finding suggests that NLPCA should be used to capture the essential features of these systems. Proteins 2006. © 2006 Wiley‐Liss, Inc.
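
As an illustration of how a free energy landscape can be represented as a function of the first two principal components, the following minimal Python sketch bins projected coordinates on a two-dimensional grid and converts the populations to relative free energies with the commonly used relation ΔG = −k_B T ln(P/P_max). The abstract does not state the exact definition, temperature, or bin width used in the study, so the function name `free_energy_map`, the 300 K temperature, the 0.5 bin width, and the toy projections are all assumptions made for this example. The projections themselves could come from either PCA or NLPCA.

```python
import math
from collections import Counter

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def free_energy_map(v1, v2, bin_width=0.5, temperature=300.0):
    """Return {(i, j): dG} for every occupied grid cell.

    v1, v2      -- projections of each conformation on the first two
                   principal components (hypothetical inputs)
    bin_width   -- edge length of a square grid cell (assumed value)
    temperature -- temperature in kelvin (assumed value)

    dG = -k_B * T * ln(P / P_max), so the most populated cell has dG = 0.
    """
    if len(v1) != len(v2) or not v1:
        raise ValueError("v1 and v2 must be non-empty and of equal length")

    # Count how many conformations fall into each grid cell.
    counts = Counter(
        (math.floor(a / bin_width), math.floor(b / bin_width))
        for a, b in zip(v1, v2)
    )
    max_count = max(counts.values())
    kT = K_B * temperature

    # The ratio n / max_count equals P / P_max because the total cancels.
    return {cell: -kT * math.log(n / max_count) for cell, n in counts.items()}


# Toy data: four conformations in one state, two in another.
pc1 = [0.1, 0.2, 0.3, 0.1, 2.1, 2.2]
pc2 = [0.1, 0.2, 0.3, 0.4, 2.1, 2.3]
landscape = free_energy_map(pc1, pc2)
```

In this toy case the less populated state lies k_B T ln 2 above the more populated one. Cells with no samples are simply absent from the returned dictionary, which avoids taking the logarithm of zero.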
