Automating Data Layout Conversion in a Large Cosmological Simulation Code

• Gkernel is:
  – an isolated, representative Gadget code kernel ("halo finder")
  – a stand-alone application that avoids simulation overhead
• Node-level optimization study [3] within the framework of the IPCC [4]
• Target computing systems:
  – Knights Corner (KNC), Ivy Bridge (IVB)
  – Haswell (HSW), Broadwell (BDW), Knights Landing (KNL)
• Main changes:
  – Data layout optimization
  – Better threading parallelism

Figure 2: Tests on one-socket Xeon systems; 240 threads (4 thr./core) for KNC; 128 threads (2 thr./core) for KNL.

Performance improvement: up to 19x faster on KNL, 13.6x on KNC, and ca. 2-5x on Xeon.
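The data layout optimization listed among the main changes is commonly a conversion from an array of structures (AoS) to a structure of arrays (SoA), so that loops over a single field read memory with unit stride and vectorize more readily. The following is a minimal hand-written sketch of that idea, not the Gadget or Gkernel code: the type names, the field set (position and mass), and the helper functions are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* AoS: one record per particle. Hypothetical fields, not Gadget's real struct. */
typedef struct {
    double pos[3];
    double mass;
} ParticleAoS;

/* SoA: one contiguous array per field. */
typedef struct {
    size_t n;
    double *x, *y, *z;
    double *mass;
} ParticlesSoA;

/* Release all field arrays and reset the container. */
static void soa_free(ParticlesSoA *s)
{
    free(s->x);
    free(s->y);
    free(s->z);
    free(s->mass);
    s->x = s->y = s->z = s->mass = NULL;
    s->n = 0;
}

/* Allocate field arrays for n particles; returns 0 on success, -1 on failure. */
static int soa_alloc(ParticlesSoA *s, size_t n)
{
    assert(n > 0);
    s->n = n;
    s->x = malloc(n * sizeof *s->x);
    s->y = malloc(n * sizeof *s->y);
    s->z = malloc(n * sizeof *s->z);
    s->mass = malloc(n * sizeof *s->mass);
    if (!s->x || !s->y || !s->z || !s->mass) {
        soa_free(s);
        return -1;
    }
    return 0;
}

/* Copy out->n particles from the AoS input into the SoA container. */
static void aos_to_soa(const ParticleAoS *in, ParticlesSoA *out)
{
    for (size_t i = 0; i < out->n; ++i) {
        out->x[i] = in[i].pos[0];
        out->y[i] = in[i].pos[1];
        out->z[i] = in[i].pos[2];
        out->mass[i] = in[i].mass;
    }
}

/* Example kernel: touches only the mass array, with unit stride. */
static double soa_total_mass(const ParticlesSoA *s)
{
    double sum = 0.0;
    for (size_t i = 0; i < s->n; ++i)
        sum += s->mass[i];
    return sum;
}
```

In the AoS form, the same reduction would step through memory in strides of `sizeof(ParticleAoS)`, loading positions it never uses. Doing this rewrite by hand across a large code base is error-prone, which is the motivation for automating it.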

[1] Fabio Baruffa, et al. Performance Optimisation of Smoothed Particle Hydrodynamics Algorithms for Multi/Many-Core Architectures, 2016, 2017 International Conference on High Performance Computing & Simulation (HPCS).

[2] V. Springel. The Cosmological simulation code GADGET-2, 2005, astro-ph/0505010.

[3] Julia L. Lawall, et al. Coccinelle: 10 Years of Automated Evolution in the Linux Kernel, 2018, USENIX Annual Technical Conference.