Performance optimization of the air shower simulation program for the Cherenkov Telescope Array

The Cherenkov Telescope Array (CTA), currently under construction, is the next-generation instrument in the field of very-high-energy gamma-ray astronomy. The first data are expected by the end of 2018, and scientific operations will start in 2022 and last about 30 years. To characterize the instrument response to the Cherenkov light emitted when cosmic-ray showers develop in the atmosphere, detailed Monte Carlo simulations will be performed regularly in parallel with CTA operation. The estimated CPU time associated with these simulations is very high, of the order of 200 million HS06 hours per year. Reducing the CPU time devoted to simulations would allow either a reduction in infrastructure cost or better coverage of the large phase space. In this paper, we focus on the main computing step (70% of the total CPU time), implemented in the CORSIKA program, and specifically on the module responsible for the propagation of Cherenkov photons in the atmosphere. We present our preliminary studies of different code optimization options, with a particular focus on vectorization facilities (SIMD instructions). Our proposals handle, as automatically as possible, the hardware portability constraints introduced by the grid computing environment that hosts these simulations. A performance evaluation in terms of running time and accuracy is provided.
