Accelerating astrophysical particle simulations with programmable hardware (FPGA and GPU)

Abstract: In a previous paper we showed that direct gravitational N-body simulations in astrophysics scale very well on moderately parallel supercomputers (of order 10–100 nodes). The best balance between computation and communication is reached when the nodes are accelerated by special-purpose hardware. In this paper we describe the implementation of particle-based astrophysical simulation codes on new types of accelerator hardware: field-programmable gate arrays (FPGA) and graphics processing units (GPU). In addition to direct gravitational N-body simulations we use the algorithmically similar "smoothed particle hydrodynamics" (SPH) method as a test application; these algorithms are applied to astrophysical problems such as the evolution of galactic nuclei with central black holes and gravitational wave generation, and star formation in galaxies and galactic nuclei. We present the code performance on a single node using different kinds of special hardware (traditional GRAPE, FPGA, and GPU) and discuss some implementation aspects (e.g. accuracy). The results show that GPU hardware is as fast as GRAPE for real application codes, but at an order of magnitude lower price, and that FPGAs are useful for accelerating complex sequences of operations (such as SPH). We discuss future prospects and new cluster computers built with new generations of FPGA and GPU cards.
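The central computational kernel offloaded to GRAPE, FPGA, and GPU hardware in such codes is the O(N^2) direct summation of softened pairwise gravitational forces. The following is a minimal illustrative sketch, in CUDA, of how this kernel is typically mapped onto a GPU with one thread per target particle; it is not the authors' production code, and the kernel name, the float4 array layout (mass stored in the w component), and the softening parameter eps2 are assumptions made for illustration only.

// Minimal sketch of a direct-summation N-body force kernel (illustrative, not the paper's code).
// Each thread accumulates the softened Newtonian acceleration on one particle.
#include <cuda_runtime.h>

__global__ void direct_nbody_forces(const float4 *pos, float4 *acc,
                                    int n, float eps2)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float4 pi = pos[i];
    float ax = 0.0f, ay = 0.0f, az = 0.0f;

    // Sum the Plummer-softened interaction over all particles.
    // The self-term (j == i) contributes zero because dx = dy = dz = 0.
    for (int j = 0; j < n; ++j) {
        float dx = pos[j].x - pi.x;
        float dy = pos[j].y - pi.y;
        float dz = pos[j].z - pi.z;
        float r2   = dx * dx + dy * dy + dz * dz + eps2;
        float rinv = rsqrtf(r2);
        float coef = pos[j].w * rinv * rinv * rinv;  // m_j / r^3
        ax += coef * dx;
        ay += coef * dy;
        az += coef * dz;
    }
    acc[i] = make_float4(ax, ay, az, 0.0f);
}

A tuned implementation would stage blocks of source particles through shared memory to reuse data across threads (as in the GRAPE-style pipelines mentioned above), but the arithmetic body of the pipeline is the loop shown here.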
