Paper information - Efficiency and scalability of two parallel QR factorization algorithms

Efficiency and scalability of two parallel QR factorization algorithms

Both the Householder QR factorization algorithm and the modified Gram-Schmidt algorithm can be written in terms of matrix-matrix operations using the compact WY representation. Parallelizations of the resulting algorithms are reviewed and analyzed. For this purpose a general framework for analyzing the scalability of parallel algorithms is presented.
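As an illustration of the compact WY representation named in the abstract, the following is a minimal serial sketch in pure Python. It is not the paper's parallel algorithm. It accumulates the Householder reflectors H_j = I - tau_j v_j v_j^T into the form Q = I - Y T Y^T, with T upper triangular, so that Q can later be applied through matrix-matrix products. The function names `householder_compact_wy` and `form_q` are hypothetical, the input is assumed to be a dense m x n list of lists with m >= n, and the modified Gram-Schmidt variant is not covered.

```python
import math


def householder_compact_wy(A):
    """Return (Y, T, R) such that Q = I - Y T Y^T and A = Q R.

    A is an m x n list of lists with m >= n (an assumption of this sketch).
    """
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]               # overwritten with the triangular factor
    Y = [[0.0] * n for _ in range(m)]       # Householder vectors, one per column
    T = [[0.0] * n for _ in range(n)]       # upper triangular coupling matrix
    for j in range(n):
        # Build the Householder vector v for column j, rows j..m-1.
        x = [R[i][j] for i in range(j, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue                        # nothing to eliminate: H_j = I, tau_j = 0
        alpha = -math.copysign(norm_x, x[0])  # sign chosen to avoid cancellation
        v = x[:]
        v[0] -= alpha
        tau = 2.0 / sum(t * t for t in v)
        # Apply H_j = I - tau v v^T to columns j..n-1 of R.
        for c in range(j, n):
            s = tau * sum(v[i - j] * R[i][c] for i in range(j, m))
            for i in range(j, m):
                R[i][c] -= s * v[i - j]
        # Store v as column j of Y.
        for i in range(j, m):
            Y[i][j] = v[i - j]
        # Extend T: T[:j, j] = -tau * T[:j, :j] (Y[:, :j]^T v_j), T[j][j] = tau.
        z = [sum(Y[i][k] * Y[i][j] for i in range(j, m)) for k in range(j)]
        for r in range(j):
            T[r][j] = -tau * sum(T[r][k] * z[k] for k in range(r, j))
        T[j][j] = tau
    return Y, T, R


def form_q(Y, T):
    """Form the m x m matrix Q = I - Y T Y^T explicitly (for checking only)."""
    m, n = len(Y), len(T)
    # W = Y T, using the fact that T is upper triangular.
    W = [[sum(Y[i][k] * T[k][c] for k in range(c + 1)) for c in range(n)]
         for i in range(m)]
    return [[(1.0 if i == r else 0.0) - sum(W[i][k] * Y[r][k] for k in range(n))
             for r in range(m)]
            for i in range(m)]
```

Calling `householder_compact_wy(A)` and then `form_q(Y, T)` on a small matrix gives a Q whose product with R reproduces A up to rounding. A blocked implementation would typically avoid forming Q and instead apply I - Y T Y^T to a panel of columns with two matrix-matrix products, which is the property that makes this representation attractive for parallel machines.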

J. Malard | C. C. Paige
