Scaling to 150K cores: recent algorithm and performance engineering developments enabling XGC1 to run at scale

For many decades, particle-in-cell (PIC) methods have proven effective in discretizing the Vlasov-Maxwell system of equations describing the core of toroidal burning plasmas. Recent physical understanding of the importance of edge physics for stability and transport in tokamaks has led to the development of the first fully toroidal edge PIC code, XGC1. The edge region poses special problems in meshing for PIC methods because it lacks closed flux surfaces, which makes field-line-following meshes and coordinate systems problematic. We present a solution to this problem: a semi-field-line-following mesh method in a cylindrical coordinate system. Additionally, modern supercomputers require highly concurrent algorithms and implementations that use all levels of the memory hierarchy efficiently to achieve optimal performance. This paper also presents a mesh and particle partitioning method, suited to our meshing strategy, for use on highly concurrent cache-based computing platforms.