Implementing molecular dynamics on hybrid high performance computers - Particle-particle particle-mesh

Abstract: The use of accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, machines with nodes containing more than one type of floating-point processor (e.g. CPU and GPU), are becoming more prevalent because of these advantages. In this paper, we continue our previous work on implementing accelerator algorithms in the LAMMPS molecular dynamics software for distributed-memory parallel hybrid machines. That work focused on acceleration for short-range models, with an approach intended to harness the processing power of both the accelerator and (multi-core) CPUs. To augment the existing implementations, we present an efficient implementation of long-range electrostatic force calculation for molecular dynamics, specifically the particle-particle particle-mesh method, based on the work by Harvey and De Fabritiis. We present benchmark results on the Keeneland InfiniBand GPU cluster and compare the performance of the same kernels compiled with both CUDA and OpenCL. We discuss limitations to parallel efficiency and future directions for improving performance on hybrid or heterogeneous computers.

[1]  Klaus Schulten, et al.  Multilevel summation of electrostatic potentials using graphics processing units, 2009, Parallel Comput.

[2]  M. Deserno, et al.  How to mesh up Ewald sums. II. An accurate error estimate for the particle-particle-particle-mesh algorithm, 1998, cond-mat/9807100.

[3]  M. J. Harvey, et al.  An Implementation of the Smooth Particle Mesh Ewald Method on GPU Hardware, 2009, Journal of chemical theory and computation.

[4]  T. Darden, et al.  Particle mesh Ewald: An N⋅log(N) method for Ewald sums in large systems, 1993.

[5]  Peng Wang, et al.  Implementing molecular dynamics on hybrid high performance computers - short range forces, 2011, Comput. Phys. Commun.

[7]  J. D. Gezelter, et al.  Is the Ewald summation still necessary? Pairwise alternatives to the accepted standard for long-range electrostatics, 2006, The Journal of chemical physics.

[8]  Steve Plimpton, et al.  Fast parallel algorithms for short-range molecular dynamics, 1993.

[11]  T. Darden, et al.  A smooth particle mesh Ewald method, 1995.

[12]  Christian Holm, et al.  Interlaced P3M algorithm with analytical and ik-differentiation, 2010, The Journal of chemical physics.

[13]  Rastko Sknepnek, et al.  A Graphics Processing Unit Implementation of Coulomb Interaction in Molecular Dynamics, 2010, Journal of chemical theory and computation.

[14]  Christian Holm, et al.  How to mesh up Ewald sums. I. A theoretical and numerical comparison of various particle mesh routines, 1998.

[15]  R. W. Hockney, et al.  Computer Simulation Using Particles, 1966.

[16]  C. Sagui, et al.  Multigrid methods for classical molecular dynamics simulations of biomolecules, 2001.

[17]  P. P. Ewald.  Die Berechnung optischer und elektrostatischer Gitterpotentiale, 1921.

[18]  Hans Werner Meuer, et al.  Top500 Supercomputer Sites, 1997.