Efficient parallel solution of linear systems

The most efficient known parallel algorithms for inverting a nonsingular $n \times n$ matrix $A$ or solving a linear system $Ax = b$ over the rationals require $O(\log^2 n)$ time and $M(n)\,n^{0.5}$ processors, where $M(n)$ is the number of processors required to multiply two $n \times n$ rational matrices in $O(\log n)$ time. Furthermore, all known polylog time algorithms for these problems are unstable: they require the calculation to be done with perfect precision; otherwise they give no results at all. This paper describes parallel algorithms that have good numerical stability and remain efficient as $n$ grows large. In particular, we describe a quadratically convergent iterative method that gives the inverse (within relative precision $2^{-n^{O(1)}}$) of an $n \times n$ rational matrix $A$ with condition number at most $n^{O(1)}$ in $O(\log^2 n)$ time using $M(n)$ processors. This is the optimum processor bound and a factor of $n^{0.5}$ improvement over the known processor bounds for polylog time matrix inversion. It is the first known polylog time algorithm that is numerically stable. The algorithm relies on our method of computing an approximate inverse of $A$, which involves $O(\log n)$ parallel steps and $n^2$ processors. We also give a parallel algorithm for solving a linear system $Ax = b$ with a sparse $n \times n$ symmetric positive definite matrix $A$. If the graph $G(A)$ (which has $n$ vertices and an edge for each nonzero entry of $A$) is $s(n)$-separable, then our algorithm requires only $O((\log n)(\log s(n))^2)$ time and $|E| + M(s(n))$ processors. The algorithm computes a recursive factorization of $A$, so that solving any other linear system $Ax = b'$ with the same matrix $A$ requires only $O(\log n \log s(n))$ time and $|E| + s(n)^2$ processors.
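
As an illustration of the kind of quadratically convergent iteration mentioned above, the following is a minimal sequential sketch of Newton's iteration for the matrix inverse, $B_{k+1} = B_k(2I - AB_k)$. It is not taken from the paper. The starting matrix $B_0 = A^T / (\|A\|_1 \|A\|_\infty)$ is an assumption made here for illustration (the abstract says only that an approximate inverse is computed first), and the function names `matmul`, `initial_guess`, and `newton_inverse` are hypothetical.

```python
def matmul(X, Y):
    """Multiply two dense matrices given as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def initial_guess(A):
    """Assumed starting point: B0 = A^T / (||A||_1 * ||A||_inf)."""
    norm_1 = max(sum(abs(v) for v in col) for col in zip(*A))  # max column sum
    norm_inf = max(sum(abs(v) for v in row) for row in A)      # max row sum
    t = 1.0 / (norm_1 * norm_inf)
    # The columns of A are the rows of A^T.
    return [[t * v for v in col] for col in zip(*A)]


def newton_inverse(A, tol=1e-12, max_iter=100):
    """Approximate the inverse of a nonsingular square matrix A."""
    n = len(A)
    B = initial_guess(A)
    for _ in range(max_iter):
        AB = matmul(A, B)
        # Largest entry of the residual R = I - A B.
        resid = max(
            abs((1.0 if i == j else 0.0) - AB[i][j])
            for i in range(n)
            for j in range(n)
        )
        if resid < tol:
            break
        # Newton step: B <- B (2I - A B), which squares the residual.
        C = [[(2.0 if i == j else 0.0) - AB[i][j] for j in range(n)]
             for i in range(n)]
        B = matmul(B, C)
    return B


if __name__ == "__main__":
    A = [[4.0, 1.0], [2.0, 3.0]]
    for row in newton_inverse(A):
        print(row)
```

Each step satisfies $I - AB_{k+1} = (I - AB_k)^2$, so the residual is squared at every iteration; this is the sense in which the iteration converges quadratically. In a parallel setting the cost of a step is dominated by two matrix products, which is consistent with the $M(n)$ processor count quoted in the abstract.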
