ScaLAPACK's MRRR algorithm

The (sequential) algorithm of Multiple Relatively Robust Representations, MRRR, is a more efficient variant of inverse iteration that does not require reorthogonalization. It solves the eigenproblem of an unreduced symmetric tridiagonal matrix T ∈ R^{n×n} at O(n²) cost. The computed normalized eigenvectors are numerically orthogonal in the sense that the dot product between different vectors is O(nε), where ε denotes the relative machine precision. This article describes the design of ScaLAPACK's parallel MRRR algorithm. One emphasis is on the critical role of the representation tree in achieving both adequate accuracy and parallel scalability. A second point concerns the favorable properties of this code: subset computation, the use of static memory, and scalability. Unlike ScaLAPACK's Divide & Conquer and QR, MRRR can compute subsets of eigenpairs at reduced cost. In contrast to inverse iteration, which can fail, it is guaranteed to produce a satisfactory answer while maintaining memory scalability. ParEig, the parallel MRRR algorithm for PLAPACK, uses dynamic memory allocation. Our code avoids this at marginal additional cost. We also use a different representation tree criterion that allows for more accurate computation of the eigenvectors but can make parallelization more difficult.
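The orthogonality criterion and the subset feature can be tried on a single machine. The following is a minimal sketch, not ScaLAPACK's parallel code: it assumes NumPy and SciPy are available and calls SciPy's `eigh_tridiagonal` with `lapack_driver="stemr"`, which dispatches to LAPACK's sequential MRRR routine. The matrix size, the random test matrix, and the helper name `orthogonality_loss` are illustrative choices, not taken from the article.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal


def orthogonality_loss(V):
    """Largest entry of |V^T V - I|: covers the dot products between
    different vectors and the deviation of each vector from unit norm."""
    gram = V.T @ V
    return np.max(np.abs(gram - np.eye(V.shape[1])))


# Illustrative test matrix: random diagonal d and off-diagonal e.
# Nonzero off-diagonal entries make T unreduced.
n = 200
rng = np.random.default_rng(0)
d = rng.standard_normal(n)
e = rng.standard_normal(n - 1)

# Full eigendecomposition with the sequential MRRR driver.
w, V = eigh_tridiagonal(d, e, lapack_driver="stemr")

# Subset computation: only the 10 smallest eigenpairs (indices 0..9).
w_sub, V_sub = eigh_tridiagonal(
    d, e, select="i", select_range=(0, 9), lapack_driver="stemr"
)

eps = np.finfo(float).eps
loss = orthogonality_loss(V)
print(f"orthogonality loss = {loss:.2e}, n * eps = {n * eps:.2e}")
```

The printed loss is meant to be compared with n·ε by eye; the hidden constant in O(nε) is not specified here, so the sketch does not assert a particular bound.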
