Parallelization, Vectorization and Visualization of Large Scale Plasma Particle Simulations and Its Application to Studies of Intense Laser Interactions

Three-dimensional PIC (Particle-in-Cell) simulation is indispensable in studies of nonlinear plasma physics, such as the interaction of ultra-intense lasers with plasmas. A three-dimensional simulation requires a large number of particles, more than 10^7. It is therefore very important to develop a parallelization and vectorization scheme for the PIC code and a visualization method for the huge volume of simulation data. In this paper we present a new parallelization scheme suited to present-day supercomputers and a method of constructing scientific color animations to analyze simulation data. We also discuss the advantage of the Abe-Nishihara vectorization method for large-scale PIC simulations. Most present-day supercomputers consist of multiple nodes, each of which has multiple processors with a shared memory. We have developed a new parallelization scheme in which domain decomposition is applied among nodes and particle decomposition is used among the processors within a node. Domain decomposition in PIC requires the exchange of two kinds of data between neighboring domains. One is particle data, such as particle position and velocity, exchanged when a particle crosses the boundary between neighboring domains. The other is field data, such as electric field intensity and current density in the boundary region. In the three-dimensional electromagnetic PIC, forty two-dimensional variables are transferred to the neighboring domain for each boundary surface. MPI (Message Passing Interface) is used to transmit these data between nodes. The particle data and the field data are each first packed into a one-dimensional array and then sent to the other node, which reduces the number of communications. The particle decomposition is performed by auto-parallelization of do-loops.
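The packing step described above can be illustrated with a minimal Python sketch. It is not the code used in this work: it assumes a decomposition along x only, six values per particle (position and velocity), and the hypothetical helper names `pack_leaving_particles` and `unpack_particles`. The actual MPI transfer is omitted; each flat buffer stands for what would be sent to a neighboring node in a single message.

```python
from typing import List, Tuple

# Assumed particle layout for this sketch: (x, y, z, vx, vy, vz).
Particle = Tuple[float, float, float, float, float, float]
VALUES_PER_PARTICLE = 6


def pack_leaving_particles(
    particles: List[Particle], x_lo: float, x_hi: float
) -> Tuple[List[Particle], List[float], List[float]]:
    """Split particles of a domain [x_lo, x_hi) into those that stay and
    two flat one-dimensional buffers for the left and right neighbors."""
    kept: List[Particle] = []
    to_left: List[float] = []
    to_right: List[float] = []
    for p in particles:
        x = p[0]
        if x < x_lo:
            to_left.extend(p)    # crossed the lower boundary
        elif x >= x_hi:
            to_right.extend(p)   # crossed the upper boundary
        else:
            kept.append(p)       # still inside this domain
    return kept, to_left, to_right


def unpack_particles(buffer: List[float]) -> List[Particle]:
    """Rebuild particle tuples from a flat buffer received from a neighbor."""
    if len(buffer) % VALUES_PER_PARTICLE != 0:
        raise ValueError("buffer length is not a multiple of the particle size")
    return [
        tuple(buffer[i:i + VALUES_PER_PARTICLE])
        for i in range(0, len(buffer), VALUES_PER_PARTICLE)
    ]


if __name__ == "__main__":
    domain_particles = [
        (0.5, 0.0, 0.0, 1.0, 0.0, 0.0),   # stays
        (1.2, 0.3, 0.1, 2.0, 0.0, 0.0),   # leaves to the right
        (-0.1, 0.7, 0.2, -1.5, 0.0, 0.0),  # leaves to the left
    ]
    kept, left_buf, right_buf = pack_leaving_particles(domain_particles, 0.0, 1.0)
    print(len(kept), len(left_buf), len(right_buf))
```

Because all leaving particles for one neighbor share a single buffer, the number of messages per step depends on the number of neighbors rather than on the number of particles crossing the boundary.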
We measured the scalability of the layered parallelization scheme for 25,600,000 particles on a 128×128×128 mesh, using sixteen processors of an NEC SX-5. The layered parallelization is shown to provide scalable acceleration of computation for large-scale PIC simulations.