Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms
[1] Ramesh C. Agarwal,et al. A three-dimensional approach to parallel matrix multiplication , 1995, IBM J. Res. Dev..
[2] Sartaj Sahni,et al. Parallel Matrix and Graph Algorithms , 1981, SIAM J. Comput..
[3] James Demmel,et al. Communication avoiding Gaussian elimination , 2008, HiPC 2008.
[4] Cleve Ashcraft. The fan-both family of column-based distributed Cholesky factorization algorithms , 1993 .
[5] Alok Aggarwal,et al. Communication Complexity of PRAMs , 1990, Theor. Comput. Sci..
[6] William Gropp,et al. Using MPI: portable parallel programming with the message-passing interface , 1994 .
[7] Robert A. van de Geijn,et al. SUMMA: scalable universal matrix multiplication algorithm , 1995, Concurr. Pract. Exp..
[8] Philip Heidelberger,et al. The deep computing messaging framework: generalized scalable message passing on the Blue Gene/P supercomputer , 2008, ICS '08.
[9] Alexander Tiskin,et al. Memory-Efficient Matrix Multiplication in the BSP Model , 1999, Algorithmica.
[10] Patricia J. Teller,et al. Proceedings of the 2008 ACM/IEEE conference on Supercomputing , 2008, HiPC 2008.
[11] Lynn Elliot Cannon,et al. A cellular computer to implement the Kalman filter algorithm , 1969 .
[12] A. George,et al. Graph theory and sparse matrix computation , 1993 .
[13] James Demmel,et al. Minimizing Communication in Numerical Linear Algebra , 2009, SIAM J. Matrix Anal. Appl..
[14] Dror Irony,et al. Trading Replication for Communication in Parallel Distributed-Memory Dense Solvers , 2002 .
[15] Sivan Toledo,et al. Communication lower bounds for distributed-memory matrix multiplication , 2004 .
[16] Jack Dongarra,et al. ScaLAPACK user's guide , 1997 .
[17] Dror Irony,et al. Trading Replication for Communication in Parallel Distributed-Memory Dense Solvers , 2002, Parallel Process. Lett..
[18] James Demmel,et al. CALU: A Communication Optimal LU Factorization Algorithm , 2011, SIAM J. Matrix Anal. Appl..
[19] James Demmel,et al. Fast linear algebra is stable , 2006, Numerische Mathematik.
[20] Amith R. Mamidala,et al. MPI Collective Communications on the Blue Gene/P Supercomputer: Algorithms and Optimizations , 2009, Hot Interconnects.
[21] S. Lennart Johnsson,et al. Minimizing the Communication Time for Matrix Multiplication on Multiprocessors , 1993, Parallel Comput..