The improved BiCG method for large and sparse linear systems on parallel distributed memory architectures

For the solution of large, sparse linear systems of equations with unsymmetric coefficient matrices, we propose an improved version of the BiConjugate Gradient method (IBiCG), based on [5, 6], that uses the Lanczos process as its major component and combines elements of numerical stability and parallel algorithm design. For the Lanczos process, stability is obtained by a coupled two-term procedure that generates Lanczos vectors scaled to unit length. The algorithm is derived such that all inner products, matrix-vector multiplications, and vector updates of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Therefore, the cost of global communication on parallel distributed memory computers can be significantly reduced. The resulting IBiCG algorithm maintains the favorable properties of the Lanczos process without increasing computational costs. A data distribution suitable for both irregularly and regularly structured matrices, based on an analysis of the non-zero matrix elements, is presented. The communication scheme is supported by overlapping the execution of computation and communication to reduce waiting times. The efficiency of the method is demonstrated by numerical experiments carried out on a massively parallel distributed memory system.
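For orientation, the following is a minimal serial sketch of the classical, unpreconditioned BiCG iteration, not the IBiCG variant proposed here. It is meant only to show where the inner products sit in one iteration step. The function name `bicg`, its parameters, and the choice of the shadow residual equal to the initial residual are assumptions made for illustration.

```python
import numpy as np


def bicg(A, b, x0=None, tol=1e-10, max_iter=500):
    """Classical BiCG for A x = b with an unsymmetric matrix A (illustrative sketch)."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    r = b - A @ x            # residual
    rt = r.copy()            # shadow residual (assumed choice: equal to r)
    p, pt = r.copy(), rt.copy()
    rho = rt @ r             # inner product: a global reduction when distributed
    b_norm = np.linalg.norm(b) or 1.0
    for k in range(max_iter):
        # The norm is also a reduction in a distributed setting.
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        q = A @ p            # matrix-vector product with A
        qt = A.T @ pt        # matrix-vector product with A^T
        sigma = pt @ q       # inner product: needed before any vector update
        if sigma == 0.0 or rho == 0.0:
            raise ZeroDivisionError("BiCG breakdown")
        alpha = rho / sigma
        x += alpha * p       # vector updates depend on alpha
        r -= alpha * q
        rt -= alpha * qt
        rho_new = rt @ r     # inner product: depends on the updates above
        beta = rho_new / rho
        p = r + beta * p
        pt = rt + beta * pt
        rho = rho_new
    return x, max_iter
```

In this classical form the inner products and the vector updates depend on each other in sequence, so on a distributed memory machine each inner product acts as a separate global synchronization point. A call such as `x, its = bicg(A, b)` with a NumPy array `A` returns the approximate solution and the number of iterations performed.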

[1]  John B. Shoven, et al. I, Edinburgh Medical and Surgical Journal.

[2]  H. Martin Bücker, et al. A Variant of the Biconjugate Gradient Method Suitable for Massively Parallel Computing, 1997, IRREGULAR.

[3]  G. Golub, et al. Iterative solution of linear systems, 1991, Acta Numerica.

[4]  R. Fletcher. Conjugate gradient methods for indefinite systems, 1976.

[5]  Jack J. Dongarra, et al. Solving linear systems on vector and shared memory computers, 1990.

[6]  H. Martin Bücker, et al. A Parallel Version of the Unsymmetric Lanczos Algorithm and its Application to QMR, 1996.

[7]  Zhishun A. Liu, et al. A Look Ahead Lanczos Algorithm for Unsymmetric Matrices, 1985.

[8]  H. V. D. Vorst, et al. Reducing the effect of global communication in GMRES(m) and CG on parallel distributed memory computers, 1995.

[9]  D. Taylor. Analysis of the Look Ahead Lanczos Algorithm, 1982.

[10]  Achim Basermann, et al. Preconditioned CG Methods for Sparse Matrices on Massively Parallel Machines, 1997, Parallel Comput.

[11]  C. Lanczos. Solution of Systems of Linear Equations by Minimized Iterations, 1952.

[12]  Claude Pommerell, et al. Solution of large unsymmetric systems of linear equations, 1992.

[13]  H. Martin Bücker, et al. A Parallel Version of the Quasi-Minimal Residual Method, Based on Coupled Two-Term Recurrences, 1996, PARA.

[14]  E. Sturler. A Parallel Variant of GMRES(m), 1991.

[15]  R. Freund, et al. QMR: a quasi-minimal residual method for non-Hermitian linear systems, 1991.

[16]  Roland W. Freund, et al. An Implementation of the QMR Method Based on Coupled Two-Term Recurrences, 1994, SIAM J. Sci. Comput.