AutoPas in ls1 mardyn: Massively parallel particle simulations with node-level auto-tuning

Due to its computational cost, simulation software needs to use the most efficient building blocks available, such as data structures, solver algorithms, and parallelization schemes, while typically also supporting a variety of hardware architectures. AutoPas implements the computationally most expensive molecular dynamics (MD) steps (e.g., the force calculation) and chooses the optimal combination of these building blocks on the fly, i.e., at run time. We detail the decisions made in AutoPas to enable the interplay with MPI-parallel simulations and, to our knowledge, showcase the first MPI-parallel MD simulations that use dynamic tuning. We discuss the benefits of this approach for three simulation scenarios from process engineering, in which we obtain performance improvements of up to 50% compared to the baseline performance of the highly optimized ls1 mardyn code.
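
To illustrate the idea of run-time (dynamic) tuning described above, the following C++ sketch times a set of interchangeable force-kernel configurations for a few sample iterations and then continues the simulation with the fastest one. It is a minimal conceptual illustration only; the names used here (KernelVariant, tuneAndRun, the variant labels) are hypothetical placeholders and do not reflect the actual AutoPas API.

```cpp
// Conceptual sketch of node-level run-time tuning: measure each candidate
// configuration of the force calculation, then keep the fastest. Names are
// illustrative placeholders, NOT the AutoPas interface.
#include <chrono>
#include <functional>
#include <iostream>
#include <limits>
#include <string>
#include <vector>

struct KernelVariant {
    std::string name;            // e.g. "linked-cells", "verlet-lists"
    std::function<void()> run;   // one force-calculation step with this configuration
};

// Time each variant for a few sample iterations, then run the remaining
// iterations with the winner. A real tuner re-tunes periodically, since the
// particle distribution (and thus the optimal configuration) changes over time.
std::size_t tuneAndRun(const std::vector<KernelVariant>& variants,
                       int samplesPerVariant, int productionIterations) {
    std::size_t best = 0;
    double bestTime = std::numeric_limits<double>::max();

    for (std::size_t i = 0; i < variants.size(); ++i) {
        auto start = std::chrono::steady_clock::now();
        for (int s = 0; s < samplesPerVariant; ++s) variants[i].run();
        double t = std::chrono::duration<double>(
                       std::chrono::steady_clock::now() - start).count();
        std::cout << variants[i].name << ": " << t << " s\n";
        if (t < bestTime) { bestTime = t; best = i; }
    }

    for (int it = 0; it < productionIterations; ++it) variants[best].run();
    return best;
}

int main() {
    std::vector<KernelVariant> variants = {
        {"linked-cells", [] { /* placeholder force kernel */ }},
        {"verlet-lists", [] { /* placeholder force kernel */ }},
    };
    std::size_t winner = tuneAndRun(variants, /*samplesPerVariant=*/3,
                                    /*productionIterations=*/10);
    std::cout << "selected: " << variants[winner].name << "\n";
}
```

In an MPI-parallel setting, each rank can tune its own subdomain independently in this way, which is one reason the interplay with domain decomposition and load balancing requires the careful design decisions discussed in the paper.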
