Accelerating AP3M-Based Computational Astrophysics Simulations with Reconfigurable Clusters

In this paper, we present a case study of using a reconfigurable computing cluster to accelerate AP3M-based computational astrophysics simulations. AP3M, the adaptive particle-particle, particle-mesh method, underlies many computational astrophysics simulations. It dynamically and adaptively applies computational resources non-uniformly to emphasize regions of interest, so it can be faster and more energy-efficient than the traditional P3M (particle-particle, particle-mesh) approach. However, the dynamic, pointer-based data structures used by AP3M make it extremely difficult to accelerate with FPGAs. In this work, we use a custom data structure and hardware kernel to overcome these challenges. All CPU-based dynamic and pointer-based tasks are mapped to FPGAs. Our experiments show that a single FPGA outperforms a Xeon E5-2660 CPU server (8 cores) by 21x to 23x, depending on problem size and data distribution.
