Learning high-dimensional reaction coordinates of fast-folding proteins using State Predictive Information Bottleneck and Bias Exchange Metadynamics

Biological events that occur on long timescales, such as protein folding, remain hard to capture with conventional molecular dynamics (MD) simulations. Enhanced sampling techniques address this limitation by sampling regions of the free energy landscape that are separated by high energy barriers, which makes these rare events observable. However, many of these techniques require a priori knowledge of appropriate reaction coordinates (RCs) that describe the process of interest. In recent years, artificial intelligence (AI) models have emerged as promising approaches to accelerate rare-event sampling. Integrating these AI methods with MD for the automated learning of improved RCs is nevertheless not trivial, particularly for undersampled trajectories and highly complex systems. In this study, we employed the State Predictive Information Bottleneck (SPIB) neural network, coupled with bias exchange metadynamics (BE-metaD) simulations, to investigate the unfolding process of two proteins, chignolin and villin. Using high-dimensional RCs that SPIB learned even from poor training data, BE-metaD simulations dramatically accelerate the sampling of the unfolding process for both proteins. In addition, we compare different RCs and find that careful selection of RCs is crucial to substantially speed up the sampling of rare events. This approach, which combines AI with enhanced sampling techniques, therefore holds great promise for advancing our understanding of complex biological processes that occur on long timescales.
