Full-neighbor-list based numerical reproducibility method for parallel molecular dynamics simulations

Abstract The numerical nonreproducibility of parallel molecular dynamics (MD) simulations, which stems from the non-associative accumulation of floating-point data, poses serious challenges for development, debugging and validation. The most common solutions to this problem are to use a higher-precision data type or to sort the operations, but both carry significant computational overhead. This paper analyzes the sources of nonreproducibility in parallel MD simulations in detail. Two general solutions, namely sorting by force component value and using an 80-bit long double data type, are implemented and evaluated in LAMMPS. To reduce the computational cost, a full-neighbor-list based method is proposed in which the operation order is sorted by particle distance, an approach inspired by the spatial characteristics of MD simulations. An experiment on a system with constant-energy dynamics shows that the new method ensures reproducibility at any degree of parallelism, at the cost of an extra 50% computational overhead.
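The effect of non-associative accumulation, and the idea of fixing the operation order by sorting, can be illustrated with a minimal Python sketch. It is not the LAMMPS implementation described in the paper: the function names `naive_sum` and `sorted_sum` and the `(distance, force)` pair layout are assumptions made for illustration only. The sketch sorts the per-neighbor contributions on a key before accumulating, so the result no longer depends on the order in which the contributions arrive.

```python
import random

# Floating-point addition is not associative: regrouping changes the result.
a, b, c = 1.0e16, -1.0e16, 1.0
assert (a + b) + c != a + (b + c)


def naive_sum(contribs):
    """Accumulate force contributions in arrival order (order-dependent)."""
    total = 0.0
    for _distance, force in contribs:
        total += force
    return total


def sorted_sum(contribs):
    """Accumulate in a canonical order: by distance, then by force value.

    Hypothetical helper. Sorting on the full (distance, force) tuple means
    any remaining ties are identical entries, so the order is unambiguous.
    """
    total = 0.0
    for _distance, force in sorted(contribs):
        total += force
    return total


# Hypothetical (distance, force) contributions to one particle.
contribs = [(1.0, 1.0e16), (2.0, 1.0), (3.0, -1.0e16)]
reordered = [contribs[0], contribs[2], contribs[1]]

# Arrival order changes the naive result but not the sorted one.
print(naive_sum(contribs) == naive_sum(reordered))
print(sorted_sum(contribs) == sorted_sum(reordered))
```

Sorting makes the accumulation order a function of the data alone, which is what allows the result to be independent of how the work is distributed. It does not by itself make the sum more accurate.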
