A Normalization Scheme for the Non-symmetric s-Step Lanczos Algorithm

The Lanczos algorithm is among the most frequently used techniques for computing a few dominant eigenvalues of a large sparse non-symmetric matrix. When variants of this algorithm are implemented on distributed-memory computers, the synchronization time spent computing dot products increasingly limits parallel scalability. The goal of s-step algorithms is to reduce the harmful influence of dot products on parallel performance by grouping several of these operations for joint execution, thereby reducing synchronization time when a large number of processes is used. This paper extends the non-symmetric s-step Lanczos method introduced by Kim and Chronopoulos (J. Comput. Appl. Math., 42(3), 357–374, 1992) with a novel normalization scheme. Compared to the unnormalized algorithm, the normalized variant improves numerical stability and reduces the possibility of breakdowns.
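The grouping of dot products described above can be illustrated with a minimal sketch. It is not the s-step Lanczos algorithm or the normalization scheme of this paper; it only contrasts one global reduction per dot product with a single reduction for a whole block of dot products. The "processes" are simulated as slices of Python lists, and `allreduce`, `Counter`, `split`, `dots_one_by_one`, and `dots_grouped` are hypothetical names introduced for illustration (on a real machine the global sum would be a collective such as an MPI reduction).

```python
# Illustrative sketch only: processes are simulated, and allreduce() is a
# hypothetical stand-in for a global sum across all processes.

class Counter:
    """Counts how many global synchronizations were performed."""
    def __init__(self):
        self.reductions = 0

def allreduce(partials, counter):
    """Sum per-process partial results; each call is one global synchronization."""
    counter.reductions += 1
    width = len(partials[0])
    return [sum(p[k] for p in partials) for k in range(width)]

def split(vec, nproc):
    """Distribute a vector across nproc simulated processes in contiguous chunks."""
    chunk = (len(vec) + nproc - 1) // nproc
    return [vec[i * chunk:(i + 1) * chunk] for i in range(nproc)]

def local_dot(x, y):
    """Dot product of the locally owned parts of two vectors."""
    return sum(a * b for a, b in zip(x, y))

def dots_one_by_one(V, W, nproc, counter):
    """Compute all <v_i, w_j> with one global reduction per dot product."""
    s = len(V)
    G = [[0] * s for _ in range(s)]
    for i in range(s):
        for j in range(s):
            vi, wj = split(V[i], nproc), split(W[j], nproc)
            partials = [[local_dot(vi[p], wj[p])] for p in range(nproc)]
            G[i][j] = allreduce(partials, counter)[0]
    return G

def dots_grouped(V, W, nproc, counter):
    """Compute all <v_i, w_j> with the s*s local results packed into one reduction."""
    s = len(V)
    Vp = [split(v, nproc) for v in V]
    Wp = [split(w, nproc) for w in W]
    partials = [
        [local_dot(Vp[i][p], Wp[j][p]) for i in range(s) for j in range(s)]
        for p in range(nproc)
    ]
    flat = allreduce(partials, counter)
    return [flat[i * s:(i + 1) * s] for i in range(s)]
```

Both functions return the same s-by-s matrix of dot products, but for s vectors on each side the first performs s² global synchronizations and the second performs one. The grouped version sends a longer message per reduction, which is usually the cheaper trade when synchronization latency dominates.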
