Implementation of OpenMP and MPI hybrid parallelization to Monte Carlo dose simulation for particle therapy

We report the successful implementation of shared-memory parallelization using OpenMP for dose calculation with the Monte Carlo particle and heavy-ion transport code PHITS. For human voxel data, which require a large amount of memory, OpenMP shared-memory parallelization is better suited than the MPI distributed-memory parallelization originally implemented in PHITS. We have confirmed that the modified PHITS code yields the same results with and without shared-memory parallelization. A sufficiently high OpenMP parallelization efficiency (92.4% with 8 parallelized cores within a single node) is achieved even with the special memory-access treatment needed to avoid memory contention. We implement the OpenMP parallelization on top of the MPI parallelization originally included in PHITS, so that the modified code works in a hybrid manner: MPI provides distributed-memory parallelization across nodes, and OpenMP provides shared-memory parallelization among the cores within a single node. The performance has been checked on two supercomputers, RICC and the K computer.
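
To make the quoted efficiency figure concrete, the following minimal sketch converts between parallel efficiency and speedup. It assumes the conventional definition, efficiency = (serial time / parallel time) / number of cores, which the abstract does not state explicitly. The function names are hypothetical and are not part of PHITS.

```python
def parallel_efficiency(t_serial: float, t_parallel: float, cores: int) -> float:
    """Efficiency under the assumed conventional definition: speedup / cores."""
    if cores < 1 or t_serial <= 0 or t_parallel <= 0:
        raise ValueError("times must be positive and cores must be >= 1")
    speedup = t_serial / t_parallel
    return speedup / cores


def implied_speedup(efficiency: float, cores: int) -> float:
    """Speedup implied by a given efficiency on a given number of cores."""
    if cores < 1 or efficiency <= 0:
        raise ValueError("efficiency must be positive and cores must be >= 1")
    return efficiency * cores


if __name__ == "__main__":
    # Figure quoted in the abstract: 92.4% efficiency on 8 cores of one node.
    s = implied_speedup(0.924, 8)
    print(f"implied speedup on 8 cores: {s:.2f}x")
```

Under this assumed definition, 92.4% efficiency on 8 cores corresponds to a speedup of roughly 7.4 over a single core.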