OpenFPM: A scalable open framework for particle and particle-mesh codes on parallel computers

Abstract
Scalable and efficient numerical simulations continue to gain importance, as computation is firmly established as the third pillar of discovery, alongside theory and experiment. Meanwhile, the performance of computing hardware grows through increasingly heterogeneous parallelism, enabling simulations of ever more complex models. However, efficiently implementing scalable codes on heterogeneous, distributed hardware systems becomes the bottleneck. This bottleneck can be alleviated by intermediate software layers that provide higher-level abstractions closer to the problem domain, reducing development times and allowing computational scientists to focus on the simulation itself. Here, we present OpenFPM, an open and scalable framework that provides an abstraction layer for numerical simulations using particles and/or meshes. OpenFPM provides transparent and scalable infrastructure for shared-memory and distributed-memory implementations of particles-only and hybrid particle-mesh simulations of both discrete and continuous models, as well as non-simulation codes. This infrastructure is complemented with frequently used numerical routines, as well as interfaces to third-party libraries. We present the architecture and design of OpenFPM, detail the underlying abstractions, and benchmark the framework in applications ranging from Smoothed-Particle Hydrodynamics (SPH) to Molecular Dynamics (MD), Discrete Element Methods (DEM), Vortex Methods, stencil codes (finite differences), and high-dimensional Monte Carlo sampling (CMA-ES), comparing it to the current state of the art and to existing software frameworks.
Program summary
Program Title: OpenFPM
Program Files doi: http://dx.doi.org/10.17632/4yrp8nbm7c.1
Licensing provisions: GPLv3
Programming language: C++
Nature of problem: Writing numerical simulation programs that use meshes, particles, or any combination of the two typically requires long development times, in particular if the code is to scale efficiently on parallel distributed-memory computers. The long development times incur high financial and project-time costs and often lead to sub-optimal program performance as shortcuts are taken. Yet, a large portion of the functionality is common across programs and could be automated or provided as reusable software components, leading to large savings in project costs and potentially improved software performance.
Solution method: OpenFPM provides a scalable, highly efficient software platform for numerical simulations using meshes, particles, or any combination of the two on parallel computers. It is based on a well-known set of abstract data types and operators that suffice to express any such simulation, regardless of the application domain. OpenFPM provides reusable, tested, and internally parallelized software components that reduce development times and make parallel computing accessible to computational scientists without extensive knowledge in parallel programming.
Additional comments including restrictions and unusual features: OpenFPM is a software library based on which users can implement their simulation codes at a fraction of the development cost. All parallelization and memory handling is transparently done by the library. As its main innovation, OpenFPM makes use of C++ Template Meta Programming in order to enable simulations in arbitrary-dimensional spaces, distribution of arbitrary user-defined C++ objects, and compile-time code optimization and targeting for specific hardware platforms.
OpenFPM-based simulations can directly output VTK files for visualization of results and HDF5 files for data archiving.
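To make the role of compile-time dimensionality more concrete, the following is a minimal sketch of how C++ templates can fix the space dimension at compile time, so that the same source code serves 2D, 3D, or higher-dimensional particle sets. It is an illustration under stated assumptions only: the names `Point`, `squared_distance`, and `count_pairs_within` are hypothetical and are not part of the OpenFPM API, and the brute-force pair loop stands in for whatever neighbor search a real framework would use.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration; none of these names belong to OpenFPM.

// A point in Dim-dimensional space. Dim is a template parameter,
// so it is known to the compiler and costs nothing at run time.
template <std::size_t Dim, typename T = double>
struct Point {
    std::array<T, Dim> x;  // coordinates
};

// Squared Euclidean distance between two points of equal dimension.
// The loop bound is a compile-time constant, which allows the
// compiler to unroll or vectorize the loop for each dimension used.
template <std::size_t Dim, typename T>
T squared_distance(const Point<Dim, T>& a, const Point<Dim, T>& b) {
    T sum = T(0);
    for (std::size_t d = 0; d < Dim; ++d) {
        const T diff = a.x[d] - b.x[d];
        sum += diff * diff;
    }
    return sum;
}

// Count unordered particle pairs (i < j) whose distance does not
// exceed the cutoff. Brute force, O(N^2); shown for clarity only.
template <std::size_t Dim, typename T>
std::size_t count_pairs_within(const std::vector<Point<Dim, T>>& pts,
                               T cutoff) {
    assert(cutoff >= T(0));
    const T cutoff2 = cutoff * cutoff;
    std::size_t count = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        for (std::size_t j = i + 1; j < pts.size(); ++j) {
            if (squared_distance(pts[i], pts[j]) <= cutoff2) {
                ++count;
            }
        }
    }
    return count;
}
```

Instantiating `Point<2>`, `Point<3>`, or `Point<10>` yields separate, fully specialized code paths from one generic definition; mixing points of different dimension in one call is rejected at compile time rather than at run time.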
