Implementing Adaptive Ensemble Biomolecular Applications at Scale

Many scientific problems require multiple distinct computational tasks to be executed in order to achieve a desired solution. Novel approaches leverage intermediate data to adapt the application so that it can study larger problems and longer time scales, and model complex phenomena with better fidelity. In this paper, we describe the types of adaptivity found in such applications, develop abstractions to specify application adaptivity, and discuss the challenges of implementing them in software tools. We describe the design and enhancement of Ensemble Toolkit to support adaptivity, characterize the adaptivity overhead, validate the implementation of two exemplar molecular dynamics algorithms (expanded ensemble and Markov state modeling), and analyze the results of running the expanded ensemble algorithm at production scale. We discuss the novel computational capabilities enabled by abstractions for adaptive ensemble applications and the scientific advantages arising from them.
