A Scalable FPGA Design for Cloud N-Body Simulation

N-Body simulation describes the evolution of a system of N bodies interacting through mutual forces; the bodies may represent celestial objects, molecules, and so on. The most accurate algorithm for N-Body simulation, the All-Pairs method, is particularly compute-intensive, and software implementations on CPUs are inefficient in terms of both performance and power consumption. An implementation on a hardware accelerator, such as an FPGA, would benefit on both counts by exploiting parallel execution at a relatively low power profile. It would also benefit faster methods with lower computational complexity, since many of them rely on the All-Pairs approach to approximate the calculation of forces. This work proposes a highly scalable, power-efficient, and high-performance hardware architecture for the N-Body All-Pairs simulation problem. Our final implementation scales to systems with an arbitrary number of bodies thanks to a tiling approach, and it reaches performance on the order of 13,441 MPairs/s, outperforming state-of-the-art FPGA implementations in both raw performance and performance per watt. Finally, our design proves to be more power-efficient than the GRAPE-8 ASIC.
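As an illustration of the All-Pairs method and of tiling, the following software sketch computes gravitational accelerations by visiting every ordered pair of bodies, first directly and then one fixed-size tile of source bodies at a time. It is not the hardware architecture proposed here: the function names, the softening term `eps`, the constant `G`, and the tile size are all assumptions chosen for the example.

```python
import math

def pair_accel(pi, pj, mj, eps, G):
    """Acceleration on a body at pi caused by a body of mass mj at pj."""
    dx, dy, dz = pj[0] - pi[0], pj[1] - pi[1], pj[2] - pi[2]
    r2 = dx * dx + dy * dy + dz * dz + eps * eps  # softened squared distance (assumed)
    s = G * mj / (r2 * math.sqrt(r2))             # G * m_j / r^3
    return s * dx, s * dy, s * dz

def all_pairs_accel(pos, mass, eps=1e-3, G=1.0):
    """Reference O(N^2) All-Pairs evaluation over every ordered pair (i, j)."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            ax, ay, az = pair_accel(pos[i], pos[j], mass[j], eps, G)
            acc[i][0] += ax
            acc[i][1] += ay
            acc[i][2] += az
    return acc

def tiled_accel(pos, mass, tile=2, eps=1e-3, G=1.0):
    """Same computation, with source bodies processed in tiles of bounded size.

    Only `tile` source bodies are held in the local buffer at a time, so the
    working set stays fixed no matter how large N grows.
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for start in range(0, n, tile):
        # Stand-in for loading one tile into a small on-chip memory.
        local = [(j, pos[j], mass[j]) for j in range(start, min(start + tile, n))]
        for i in range(n):
            for j, pj, mj in local:
                if i == j:
                    continue
                ax, ay, az = pair_accel(pos[i], pj, mj, eps, G)
                acc[i][0] += ax
                acc[i][1] += ay
                acc[i][2] += az
    return acc
```

Both functions evaluate the same N(N-1) ordered pairs, so they should agree for any tile size; the tiled loop order only bounds how many source bodies need to be resident at once, which is the property that lets a fixed-size design handle arbitrarily large N.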
