Towards optimal parallel PM N-body codes: PMFAST

Abstract: We present PMFAST, a new parallel PM N-body code that is freely available to the public. PMFAST is based on a two-level mesh gravity solver in which the gravitational forces are separated into long-range and short-range components. The decomposition scheme minimizes communication costs and tolerates slow networks. The code approaches optimality in several dimensions. The force computations are local and exploit highly optimized vendor FFT libraries. Memory overhead is minimal, with the particle positions and velocities being the main cost. The code supports distributed and shared memory parallelization through MPI and OpenMP, respectively. The current release uses two grid levels on a slab decomposition, with periodic boundary conditions for cosmological applications. Open boundary conditions could be added with little computational overhead. We present timing information and results from a recent cosmological production run of the code using a 3712^3 mesh with 6.4 × 10^9 particles. PMFAST is cost-effective and memory-efficient.
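
The separation of gravity into long-range and short-range components can be illustrated with a minimal sketch. This is not PMFAST's actual force kernel: the cubic cutoff weight, the function names, and the unit choice (G*m1*m2 = 1) are all assumptions made for illustration. The short-range part has compact support, so it only needs local data, while the long-range remainder stays finite as r approaches zero, which is what lets a coarse mesh represent it.

```python
import math


def total_force(r):
    """Newtonian pair force magnitude, in units where G*m1*m2 = 1 (assumed)."""
    return 1.0 / (r * r)


def cutoff_weight(r, r_cut):
    """Hypothetical smooth weight: 1 at r = 0, falling to 0 at r = r_cut."""
    if r >= r_cut:
        return 0.0
    x = r / r_cut
    # Complement of the cubic smoothstep 3x^2 - 2x^3.
    return 1.0 - 3.0 * x**2 + 2.0 * x**3


def short_range_force(r, r_cut):
    """Compactly supported part; vanishes identically for r >= r_cut."""
    return total_force(r) * cutoff_weight(r, r_cut)


def long_range_force(r, r_cut):
    """Smooth remainder; together with the short-range part it gives 1/r^2."""
    return total_force(r) - short_range_force(r, r_cut)


if __name__ == "__main__":
    r_cut = 1.0
    for r in (0.1, 0.5, 0.9, 1.5):
        s = short_range_force(r, r_cut)
        l = long_range_force(r, r_cut)
        print(f"r={r:4.1f}  short={s:12.6f}  long={l:10.6f}  total={s + l:12.6f}")
```

With this particular weight the long-range force tends to 3/r_cut^2 as r goes to zero instead of diverging, so the singular part of the interaction is carried entirely by the local, short-range term.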
