Optimization strategy for the VMEC stellarator equilibrium code

VMEC (Variational Moments Equilibrium Code) [1, 2] is the main workhorse for computing three-dimensional MHD equilibria in stellarator experiments such as Wendelstein 7-X (W7-X). There is great interest in the community in significantly reducing the runtime of individual VMEC simulations, which for typical setups can take up to hours of computing time on modern processors. In particular, the ability to enter the regime of “real-time” diagnostics during the operation of the W7-X machine is considered highly desirable. Here, we present the results of an assessment and prototypical optimization of the computational performance of VMEC and propose a strategy for adapting the serial code to modern multicore-processor architectures. Starting from the most recent VMEC version, 8.49, we demonstrate that up to threefold speedups can be readily obtained for the most time-consuming routines simply by eliminating legacy program structures, which were presumably dictated by the prevalence of vector supercomputers in the 1980s, when VMEC was originally written. As a side effect, the code regains the readability that had been sacrificed to achieve high performance on traditional vector processors. Once the structure has been updated, the code is amenable to parallelization using threads (OpenMP) and message pass-