Molecular dynamics models materials by simulating each individual particle's trajectory. Many-body potentials yield more accurate trajectories and are widely used in materials science and computational chemistry. We present optimization results for one many-body potential on a range of vector instruction sets, targeting both CPUs and accelerators like the Intel Xeon Phi. Parallelization of MD simulations is well studied; by contrast, vectorization is relatively unexplored. Given the prevalence and power of modern vector units, exploiting them is imperative for high-performance software. When running on a highly parallel machine, any improvement to the per-core performance is paid back in hundreds or thousands of saved core hours. Vectorization is already commonly used in the optimization of pair potentials; many-body potentials pose new, unique challenges. Indeed, their optimization pushes the boundaries of current compilers, forcing us to use explicit vectorization techniques for now. In this study, we add an optimized implementation of the Tersoff potential to the LAMMPS molecular dynamics simulation package. To reduce the burden of explicit vectorization, we abstract from the specific vector instruction set and the desired precision: from one algorithm, we get optimized implementations for many platforms, from SSE4.2 to AVX512, and the Intel Xeon Phi. We compare the kernels across different architectures and determine suitable architecture-dependent parameters. Our optimizations benefit every architecture tested, but have a disproportionate effect on the Intel Xeon Phi, which beats the CPU (2x E5-2650) after optimization.
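To illustrate what abstracting from the instruction set and the precision can look like, the following is a minimal sketch and not the implementation added to LAMMPS. All names (`Vec`, `Mask`, `blend`, `vsin`, `tersoff_cutoff`, `cutoff_array`) are hypothetical. `Vec<T, W>` is a portable stand-in for a SIMD register built on plain loops; a real backend would presumably replace its operations with SSE, AVX, or AVX-512 intrinsics. The kernel is a smooth cutoff of the form documented for LAMMPS's `tersoff` pair style, assumed here purely for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical portable stand-in for a SIMD register: W lanes of type T.
template <typename T, std::size_t W>
struct Vec {
    using value_type = T;
    std::array<T, W> lane;

    static Vec broadcast(T x) { Vec v; v.lane.fill(x); return v; }
    static Vec load(const T* p) {
        Vec v;
        for (std::size_t i = 0; i < W; ++i) v.lane[i] = p[i];
        return v;
    }
    void store(T* p) const {
        for (std::size_t i = 0; i < W; ++i) p[i] = lane[i];
    }
};

// One boolean per lane, playing the role of a vector mask register.
template <std::size_t W>
using Mask = std::array<bool, W>;

template <typename T, std::size_t W>
Vec<T, W> operator-(const Vec<T, W>& a, const Vec<T, W>& b) {
    Vec<T, W> r;
    for (std::size_t i = 0; i < W; ++i) r.lane[i] = a.lane[i] - b.lane[i];
    return r;
}

template <typename T, std::size_t W>
Vec<T, W> operator*(const Vec<T, W>& a, const Vec<T, W>& b) {
    Vec<T, W> r;
    for (std::size_t i = 0; i < W; ++i) r.lane[i] = a.lane[i] * b.lane[i];
    return r;
}

template <typename T, std::size_t W>
Mask<W> operator<(const Vec<T, W>& a, const Vec<T, W>& b) {
    Mask<W> m;
    for (std::size_t i = 0; i < W; ++i) m[i] = a.lane[i] < b.lane[i];
    return m;
}

template <typename T, std::size_t W>
Vec<T, W> vsin(const Vec<T, W>& a) {
    Vec<T, W> r;
    for (std::size_t i = 0; i < W; ++i) r.lane[i] = std::sin(a.lane[i]);
    return r;
}

// Lane-wise select: a where the mask is set, b elsewhere.
template <typename T, std::size_t W>
Vec<T, W> blend(const Mask<W>& m, const Vec<T, W>& a, const Vec<T, W>& b) {
    Vec<T, W> r;
    for (std::size_t i = 0; i < W; ++i) r.lane[i] = m[i] ? a.lane[i] : b.lane[i];
    return r;
}

// The kernel is written once against the abstraction. Both branches are
// evaluated for every lane and combined with masks, as SIMD code requires.
template <typename V>
V tersoff_cutoff(const V& r, typename V::value_type R, typename V::value_type D) {
    using T = typename V::value_type;
    const T half_pi = T(1.5707963267948966);
    const V half = V::broadcast(T(0.5));
    // 1/2 - 1/2 * sin(pi/2 * (r - R) / D), valid for R - D <= r < R + D
    V smooth = half - vsin((r - V::broadcast(R)) * V::broadcast(half_pi / D)) * half;
    V inner = blend(r < V::broadcast(R - D), V::broadcast(T(1)), smooth);
    return blend(r < V::broadcast(R + D), inner, V::broadcast(T(0)));
}

// Driver: precision T and vector width W are chosen at the call site.
template <typename T, std::size_t W>
void cutoff_array(const T* r, T* out, std::size_t n, T R, T D) {
    assert(n % W == 0 && "sketch omits remainder handling");
    for (std::size_t i = 0; i < n; i += W)
        tersoff_cutoff(Vec<T, W>::load(r + i), R, D).store(out + i);
}
```

With this structure, `cutoff_array<double, 4>(r, out, n, R, D)` and `cutoff_array<float, 16>(r, out, n, R, D)` instantiate the same kernel source for different precisions and widths, which is the kind of reuse the abstract describes. Suitable widths per architecture are a tuning parameter and are not implied by this sketch.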