Large scale ab initio calculations based on three levels of parallelization

Abstract: We suggest and implement a parallelization scheme based on an efficient multiband eigenvalue solver, the locally optimal block preconditioned conjugate gradient (LOBPCG) method, and on an optimized three-dimensional (3D) fast Fourier transform (FFT) in the ab initio plane-wave code ABINIT. In addition to the standard data partitioning over processors corresponding to different k-points, we introduce data partitioning with respect to blocks of bands, as well as spatial partitioning in Fourier space of the coefficients over the plane-wave basis set used in ABINIT. This k-points-multiband-FFT parallelization avoids any collective communication across the whole set of processors, relying instead on one-dimensional communications only. For a single k-point, super-linear scaling is achieved for up to 100 processors, owing to extensive use of hardware-optimized BLAS, LAPACK and ScaLAPACK routines, mainly in the LOBPCG routine. We observe good performance up to 200 processors. With 10 k-points, our three-way data partitioning results in linear scaling up to 1000 processors for a practical system used for testing.
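The three-way partitioning described in the abstract can be pictured as a process grid with one axis each for k-points, band blocks and FFT slabs. The following is only a minimal sketch under stated assumptions, not the ABINIT implementation: the function names, the row-major rank ordering and the example grid sizes are all hypothetical and chosen for illustration. It shows how each processor's rank maps to a triple of coordinates, and how the groups used for one-dimensional communication arise by varying a single coordinate while the k-point is held fixed.

```python
def rank_to_coords(rank, n_kpt, n_band, n_fft):
    """Map a flat rank to (k-point, band-block, FFT-slab) coordinates.

    Assumption: row-major ordering with the FFT index varying fastest.
    """
    if not 0 <= rank < n_kpt * n_band * n_fft:
        raise ValueError("rank outside the process grid")
    k, rest = divmod(rank, n_band * n_fft)
    b, f = divmod(rest, n_fft)
    return k, b, f


def one_dimensional_groups(n_kpt, n_band, n_fft):
    """Build the rank groups along the band axis and along the FFT axis.

    Every group lives inside a single k-point, so no group spans
    the whole set of processors.
    """
    band_groups = {}  # key (k, f): ranks that differ only in band block
    fft_groups = {}   # key (k, b): ranks that differ only in FFT slab
    for rank in range(n_kpt * n_band * n_fft):
        k, b, f = rank_to_coords(rank, n_kpt, n_band, n_fft)
        band_groups.setdefault((k, f), []).append(rank)
        fft_groups.setdefault((k, b), []).append(rank)
    return band_groups, fft_groups


if __name__ == "__main__":
    # Hypothetical small grid: 2 k-points x 3 band blocks x 4 FFT slabs.
    band_groups, fft_groups = one_dimensional_groups(2, 3, 4)
    print(rank_to_coords(13, 2, 3, 4))
    print(band_groups[(0, 0)])
    print(fft_groups[(0, 0)])
```

In this sketch a processor only ever exchanges data with the other members of its band group or its FFT group, so the largest communication group has max(n_band, n_fft) members rather than the full processor count. With an MPI library, the same layout could be obtained from a Cartesian communicator split into sub-communicators along each axis.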
