Enabling particle applications for exascale computing platforms

The Exascale Computing Project (ECP) is invested in co-design to ensure that key applications are ready for exascale computing. Within ECP, the Co-design Center for Particle Applications (CoPA) is addressing challenges faced by particle-based applications across four “sub-motifs”: short-range particle–particle interactions (e.g., those that often dominate molecular dynamics (MD) and smoothed particle hydrodynamics (SPH) methods), long-range particle–particle interactions (e.g., electrostatic MD and gravitational N-body), particle-in-cell (PIC) methods, and linear-scaling electronic structure and quantum molecular dynamics (QMD) algorithms. Our crosscutting co-designed technologies fall into two categories: proxy applications (or “apps”) and libraries. Proxy apps are vehicles for evaluating the viability of incorporating various types of algorithms, data structures, and architecture-specific optimizations, along with the associated trade-offs; examples include ExaMiniMD, CabanaMD, CabanaPIC, and ExaSP2. Libraries are modular instantiations that multiple applications can use or build upon; CoPA has developed the Cabana particle library, the PROGRESS/BML libraries for QMD, and the SWFFT and fftMPI parallel FFT libraries. Success is measured by identifiable “lessons learned” that are translated either directly into parent production application codes or into libraries, with demonstrated performance and/or productivity improvements. The libraries and their use in CoPA’s ECP application partner codes are also addressed.
