Parallel numerical linear algebra

We survey general techniques and open problems in numerical linear algebra on parallel architectures. We first discuss basic principles of parallel processing, describing the costs of basic operations on parallel machines and general principles for constructing efficient algorithms. We illustrate these principles using current architectures and software systems, and by showing how one would implement matrix multiplication. We then present direct and iterative algorithms for solving linear systems of equations, linear least squares problems, the symmetric eigenvalue problem, the nonsymmetric eigenvalue problem, the singular value decomposition, and generalizations of these to two matrices. We consider dense, band, and sparse matrices.
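As a rough illustration of the matrix multiplication example mentioned above, the following is a minimal sketch of a block-partitioned product, assuming plain Python lists and a hypothetical helper name `matmul_blocked` with a hypothetical `block` parameter. It is not taken from the survey itself. It only shows the general idea that the output can be split into independent blocks, which is what makes the operation a natural candidate for distribution across processors and for reuse of data held in fast memory.

```python
def matmul_blocked(A, B, block=2):
    """Multiply A (n x m) by B (m x p), one block at a time.

    Illustrative sketch only: the name and the block parameter are
    assumptions, not an API from the survey or from any library.
    """
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    # Each (i0, j0) output block depends only on a block row of A and a
    # block column of B, so distinct output blocks could in principle be
    # computed by different processors without interfering.
    for i0 in range(0, n, block):
        for j0 in range(0, p, block):
            for k0 in range(0, m, block):
                # Accumulate the contribution of one block product into C.
                for i in range(i0, min(i0 + block, n)):
                    for k in range(k0, min(k0 + block, m)):
                        a = A[i][k]
                        for j in range(j0, min(j0 + block, p)):
                            C[i][j] += a * B[k][j]
    return C


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[7, 8], [9, 10], [11, 12]]
    print(matmul_blocked(A, B))
```

The loop here runs sequentially. On a real parallel machine the choice of block size and the assignment of blocks to processors would depend on the memory hierarchy and communication costs, which this sketch does not model.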
