Communication-Avoiding Symmetric-Indefinite Factorization

We describe and analyze a novel symmetric triangular factorization algorithm. The algorithm is essentially a block version of Aasen's triangular tridiagonalization. It factors a dense symmetric matrix $A$ as the product $A = PLTL^{T}P^{T}$, where $P$ is a permutation matrix, $L$ is lower triangular, and $T$ is block tridiagonal and banded. The algorithm is the first symmetric-indefinite communication-avoiding factorization: it performs an asymptotically optimal amount of communication in a two-level memory hierarchy for almost any cache-line size. Adaptations of the algorithm to parallel computers are likely to be communication efficient as well; one such adaptation has been recently published. The current paper describes the algorithm, proves that it is numerically stable, and proves that it is communication optimal.
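To make the shape of the factorization $A = PLTL^{T}P^{T}$ concrete, here is a minimal sketch in plain Python. It is not the paper's block algorithm, and it is not Aasen's method either: it uses the simpler, unblocked Parlett-Reid elimination, which produces a factorization of the same form with a scalar tridiagonal $T$ at a higher arithmetic cost and with no attention to communication. The function names `parlett_reid` and `ltlt` and the test matrix are illustrative assumptions, not taken from the paper.

```python
def parlett_reid(a):
    """Unblocked Parlett-Reid reduction (illustrative; not the paper's algorithm).

    Returns (perm, L, T) such that A[perm[i]][perm[j]] == (L T L^T)[i][j]
    up to rounding, with L unit lower triangular and T tridiagonal.
    """
    n = len(a)
    w = [[float(x) for x in row] for row in a]  # working copy, overwritten
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    perm = list(range(n))
    for k in range(n - 2):
        # Partial pivoting: move the largest entry of column k (rows k+1..n-1)
        # to the subdiagonal position, so every multiplier has magnitude <= 1.
        p = max(range(k + 1, n), key=lambda i: abs(w[i][k]))
        if p != k + 1:
            w[k + 1], w[p] = w[p], w[k + 1]            # swap rows
            for row in w:                              # swap columns
                row[k + 1], row[p] = row[p], row[k + 1]
            for j in range(1, k + 1):                  # swap stored multipliers
                L[k + 1][j], L[p][j] = L[p][j], L[k + 1][j]
            perm[k + 1], perm[p] = perm[p], perm[k + 1]
        pivot = w[k + 1][k]
        if pivot == 0.0:
            continue  # column k is already in tridiagonal shape
        # Row operations: eliminate column k below the subdiagonal.
        for i in range(k + 2, n):
            m = w[i][k] / pivot
            L[i][k + 1] = m
            for j in range(n):
                w[i][j] -= m * w[k + 1][j]
        # Matching column operations keep the working matrix symmetric.
        for i in range(k + 2, n):
            m = L[i][k + 1]
            for r in range(n):
                w[r][i] -= w[r][k + 1] * m
    # Keep only the tridiagonal part; the rest is zero up to rounding.
    T = [[w[i][j] if abs(i - j) <= 1 else 0.0 for j in range(n)]
         for i in range(n)]
    return perm, L, T


def ltlt(L, T):
    """Form the product L T L^T with plain nested loops."""
    n = len(L)
    lt = [[sum(L[i][k] * T[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(lt[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# A small symmetric indefinite matrix with a zero diagonal (made-up data).
A = [[0, 1, 2, 3],
     [1, 0, 4, 5],
     [2, 4, 0, 6],
     [3, 5, 6, 0]]
perm, L, T = parlett_reid(A)
B = ltlt(L, T)
err = max(abs(B[i][j] - A[perm[i]][perm[j]])
          for i in range(4) for j in range(4))
print("perm:", perm, "max reconstruction error:", err)
```

Here `perm` encodes $P$ as the identity matrix with its columns reordered by `perm`, so the check `A[perm[i]][perm[j]] == (L T L^T)[i][j]` is the entrywise form of $P^{T}AP = LTL^{T}$. The paper's contribution lies in how a factorization of this form is organized into blocks to minimize data movement, which this sketch does not attempt.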

James Demmel | Sivan Toledo | Jack J. Dongarra | Ichitaro Yamazaki | Oded Schwartz | Grey Ballard | Inon Peled | Alex Druinsky | Dulceneia Becker
