GRAPE-6A: A Single-Card GRAPE-6 for Parallel PC-GRAPE Cluster Systems

In this paper, we describe the design and performance of GRAPE-6A, a special-purpose computer for gravitational many-body simulations. It was designed for use in a PC cluster in which each node has one GRAPE-6A. Such a configuration is particularly cost-effective for running parallel tree algorithms. Although parallel tree algorithms could be run on the original GRAPE-6 hardware, doing so was not very cost-effective, since a single GRAPE-6 board was still too fast and too expensive. We therefore designed GRAPE-6A as a single PCI card to minimize the reproduction cost and to optimize the computing speed. The peak performance is 130 Gflops for one GRAPE-6A board and 3.1 Tflops for our 24-node cluster. We describe the implementation of the tree, TreePM, and individual timestep algorithms on both a single GRAPE-6A system and a GRAPE-6A cluster. Using the tree algorithm on our 16-node GRAPE-6A system, we can complete a collisionless simulation with 100 million particles (8000 steps) within 10 days.
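The figures quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, and the throughput estimate assumes that every particle is advanced once per step (a shared timestep), which the abstract does not state explicitly. Because the run completes "within 10 days", the per-step time is an upper bound and the throughput a lower bound.

```python
# Back-of-the-envelope checks on the numbers quoted in the abstract.
# All function names are illustrative assumptions, not from the paper.

GFLOPS_PER_BOARD = 130.0   # peak speed of one GRAPE-6A board (from the abstract)
SECONDS_PER_DAY = 86400.0


def cluster_peak_tflops(nodes, gflops_per_board=GFLOPS_PER_BOARD):
    """Aggregate peak speed in Tflops, assuming one board per node."""
    return nodes * gflops_per_board / 1000.0


def seconds_per_step(days, steps):
    """Average wall-clock time per step for a run of the given length."""
    return days * SECONDS_PER_DAY / steps


def particle_steps_per_second(particles, steps, days):
    """Particle updates per second, assuming a shared timestep
    (every particle is advanced at every step)."""
    return particles * steps / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    print(cluster_peak_tflops(24))                             # 24-node cluster
    print(seconds_per_step(10, 8000))                          # 16-node tree run
    print(particle_steps_per_second(100_000_000, 8000, 10))
```

With 24 boards the aggregate peak is 3.12 Tflops, consistent with the quoted 3.1 Tflops. The 8000-step run in 10 days corresponds to at most 108 seconds per step on average, or at least roughly 9.3 × 10^5 particle updates per second across the 16 nodes under the shared-timestep assumption.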
