Parallel Electronic Structure Calculations Using Multiple Graphics Processing Units (GPUs)

We present a parallel, GPU-accelerated implementation of GPAW, a density-functional theory (DFT) code based on the grid-based projector augmented-wave method. GPAW is suited to large-scale electronic structure calculations and scales to thousands of cores. We have accelerated the most computationally intensive components of the program with CUDA. We provide a performance and scaling analysis of our multi-GPU code, starting from small systems and extending to systems with thousands of atoms running on GPU clusters. On large systems we achieve speed-ups of up to 15 times.
