Performance Evaluation of Multithreaded Geant4 Simulations Using an Intel Xeon Phi Cluster

The objective of this study is to evaluate the performance of Intel Xeon Phi hardware accelerators for Geant4 simulations, especially for multithreaded applications. We present the complete methodology to guide users through the compilation of their Geant4 applications on Phi processors. We then propose a series of benchmarks to compare the performance of Xeon CPUs and Phi processors on TestEm12, a Geant4 example dedicated to the simulation of electron dose point kernels. First, we compare a distributed execution of the sequential version of this example on both architectures; we then evaluate its multithreaded version. Although Phi processors accelerate computing time (by up to a factor of 3.83) when distributing sequential Geant4 simulations, we do not reach the same level of speedup with the multithreaded version of the example.
