Performance Analysis of Parallel N-Body Codes

N-body codes are routinely used for simulation studies of physical systems, e.g. in Computational Astrophysics and Molecular Dynamics. Typically, they require only a moderate amount of run-time memory but are very demanding in computational power. A detailed analysis of the performance of an N-body code, in terms of the relative weight of each task of the code and of how that weight is influenced by software or hardware optimisations, is essential for improving such codes. The development of a dedicated device, GRAPE [9], which delivers very high performance for the most expensive computational task of these codes, the force computation, has resulted in a dramatic performance leap. We explore the performance of different versions of parallel N-body codes in which both software and hardware improvements are introduced. The use of GRAPE as a 'force computation accelerator' in a parallel computer architecture can be seen as an example of a Hybrid Architecture, where a number of Special Purpose Device boards help a general-purpose (multi)computer to reach very high performance.