Communication-efficient parallel generic pairwise elimination

The model of bulk-synchronous parallel (BSP) computation is an emerging paradigm of general-purpose parallel computing. In this paper, we consider the parallel complexity of generic pairwise elimination, special cases of which include Gaussian elimination with pairwise pivoting, Gaussian elimination over a finite field, generic Neville elimination and Givens reduction. We develop a new block-recursive, communication-efficient BSP algorithm for generic pairwise elimination.
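For orientation, the simplest of the special cases named above, Gaussian elimination with pairwise pivoting, can be sketched sequentially as follows. This is only an illustrative sketch and not the block-recursive BSP algorithm developed in the paper: the function name `pairwise_eliminate`, the Neville-style bottom-up sweep order, and the use of floating-point arithmetic are all assumptions made for the example.

```python
def pairwise_eliminate(a):
    """Reduce a square matrix to upper triangular form using only
    operations on pairs of adjacent rows (sequential illustration;
    hypothetical helper, not the paper's BSP algorithm)."""
    n = len(a)
    u = [[float(x) for x in row] for row in a]  # work on a copy
    for j in range(n - 1):
        # Sweep from the bottom row upward: each step zeroes u[i][j]
        # using only the adjacent row i-1.
        for i in range(n - 1, j, -1):
            # Pairwise pivoting: keep the larger entry in the upper row,
            # so the multiplier below never exceeds 1 in magnitude.
            if abs(u[i][j]) > abs(u[i - 1][j]):
                u[i - 1], u[i] = u[i], u[i - 1]
            if u[i - 1][j] != 0.0:
                m = u[i][j] / u[i - 1][j]
                for k in range(j, n):
                    u[i][k] -= m * u[i - 1][k]
                u[i][j] = 0.0  # clear any rounding residue
    return u


example = [[2, 1, 1],
           [4, 3, 3],
           [8, 7, 9]]
upper = pairwise_eliminate(example)
```

Every elementary step touches exactly two rows, which is the property that the generic pairwise formulation abstracts: replacing the row swap and subtraction with a Givens rotation would give Givens reduction under the same access pattern.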
