Analyzing and improving maximal attainable accuracy in the communication hiding pipelined BiCGStab method

Abstract Pipelined Krylov subspace methods avoid communication latency by reducing the number of global synchronization bottlenecks and by hiding global communication behind useful computational work. In exact arithmetic, pipelined Krylov subspace algorithms are equivalent to classic Krylov subspace methods and generate identical series of iterates. However, as a consequence of the reformulation of the algorithm to improve parallelism, pipelined methods may suffer from severely reduced attainable accuracy in a practical finite precision setting. This work presents a numerical stability analysis that describes and quantifies the impact of local rounding error propagation on the maximal attainable accuracy of the multi-term recurrences in the preconditioned pipelined BiCGStab method. Theoretical expressions for the gaps between the true and computed residuals, as well as for other auxiliary variables used in the algorithm, are derived, and the elementary dependencies of these gaps on the various recursively computed vector variables are analyzed. The norms of the corresponding propagation matrices and vectors provide insight into the possible amplification of local rounding errors throughout the algorithm. The stability of the pipelined BiCGStab method is compared numerically to that of pipelined CG on a symmetric benchmark problem. Furthermore, numerical evidence is provided to support the effectiveness of a residual replacement type strategy for improving the maximal attainable accuracy of the pipelined BiCGStab method.
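The residual gap discussed in the abstract can be observed even in the classic (non-pipelined) BiCGStab recurrence: the recursively updated residual r_i drifts away from the true residual b - A x_i due to local rounding errors. The following minimal sketch (textbook BiCGStab of van der Vorst, without preconditioning or pipelining; the test matrix and tolerances are arbitrary illustrative choices) instruments the iteration to measure that gap.

```python
import numpy as np

def bicgstab_with_gap(A, b, x0, tol=1e-10, maxiter=200):
    """Classic BiCGStab, returning the final iterate, the recursively
    updated residual, and the residual gap ||(b - A x) - r||."""
    x = x0.copy()
    r = b - A @ x                      # true residual at start
    rhat = r.copy()                    # fixed shadow residual
    rho = alpha = omega = 1.0
    v = np.zeros_like(b)
    p = np.zeros_like(b)
    for _ in range(maxiter):
        rho_new = rhat @ r
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        v = A @ p
        alpha = rho_new / (rhat @ v)
        s = r - alpha * v
        t = A @ s
        omega = (t @ s) / (t @ t)
        x = x + alpha * p + omega * s
        r = s - omega * t              # recursively updated residual
        rho = rho_new
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
    gap = np.linalg.norm((b - A @ x) - r)
    return x, r, gap

# Small nonsymmetric, diagonally dominant test system.
n = 50
A = (np.diag(4.0 * np.ones(n))
     + np.diag(-1.0 * np.ones(n - 1), -1)
     + np.diag(-2.0 * np.ones(n - 1), 1))
b = np.ones(n)
x, r, gap = bicgstab_with_gap(A, b, np.zeros(n))
true_res_norm = np.linalg.norm(b - A @ x)
print(f"recursive ||r|| = {np.linalg.norm(r):.2e}, "
      f"true ||b - Ax|| = {true_res_norm:.2e}, gap = {gap:.2e}")
```

On this well-conditioned system the gap stays near machine precision; the analysis in the paper concerns how the additional auxiliary recurrences of the pipelined reformulation can amplify such local rounding errors much more strongly.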
