Automatic mutual information noise omission (AMINO): generating order parameters for molecular systems

Molecular dynamics (MD) simulations generate valuable all-atom resolution trajectories of complex systems, but analyzing these high-dimensional data and reaching practically relevant timescales, even with powerful supercomputers, remain open problems. Many specialized sampling and reaction coordinate construction methods exist to alleviate these problems. However, these methods typically do not work directly on all atomic coordinates, and still require prior knowledge of the important distinguishing features of the system, known as order parameters (OPs). Here we present AMINO, an automated method that generates such OPs by screening a very large dictionary of candidate OPs, such as all heavy-atom contacts in a biomolecule. AMINO uses ideas from information theory to learn OPs that can serve as input for designing a reaction coordinate, which can in turn be used in many enhanced sampling methods. We outline its key theoretical underpinnings and apply it to systems of increasing complexity. Our applications include a problem of tremendous pharmaceutical and engineering relevance, namely, calculating the binding affinity of a protein–ligand system when all that is known is the structure of the bound system. Our calculations are performed without human intervention and yield results in very good agreement with long unbiased MD simulations on the Anton supercomputer, at orders of magnitude less computer time. We thus expect AMINO to be useful for calculating thermodynamics and kinetics in the study of diverse molecular systems.
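As an illustration of the kind of information-theoretic screening described above, the sketch below estimates a normalized mutual-information distance, D = 1 - I(X;Y)/H(X,Y), between two OP time series from a 2D histogram. The abstract does not state which measure or estimator AMINO uses, so the distance definition, the histogram estimator, the function name `mi_distance`, and the `bins` parameter are all assumptions made for illustration only. OPs at small distance carry largely redundant information, which is the sort of redundancy a dictionary-screening procedure could exploit.

```python
import numpy as np


def mi_distance(x, y, bins=20):
    """Estimate D = 1 - I(X;Y) / H(X,Y) from two 1D time series.

    Illustrative assumption: a plain histogram estimator with `bins`
    bins per axis; this is not taken from the AMINO abstract.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()   # joint probability table
    p_x = p_xy.sum(axis=1)       # marginal of X
    p_y = p_xy.sum(axis=0)       # marginal of Y

    def entropy(p):
        p = p[p > 0]             # drop empty bins (0 * log 0 = 0)
        return -np.sum(p * np.log(p))

    h_xy = entropy(p_xy.ravel())
    if h_xy == 0.0:              # all samples in one bin: no information
        return 0.0
    mutual_info = entropy(p_x) + entropy(p_y) - h_xy
    return 1.0 - mutual_info / h_xy


# Toy data standing in for OP time series (hypothetical, not from the paper).
rng = np.random.default_rng(0)
op_a = rng.normal(size=5000)
op_b = op_a + 0.1 * rng.normal(size=5000)   # noisy copy of op_a
op_c = rng.normal(size=5000)                # independent of op_a

print("redundant pair:  ", mi_distance(op_a, op_b))
print("independent pair:", mi_distance(op_a, op_c))
```

In this toy setup the noisy copy lies closer to `op_a` than the independent series does, so a screening step based on such a distance would treat `op_a` and `op_b` as largely redundant. Histogram estimates of mutual information are biased for finite samples, so the bin count matters in practice.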
