Communication-optimal parallel algorithm for Strassen's matrix multiplication
James Demmel | Oded Schwartz | Grey Ballard | Olga Holtz | Benjamin Lipshitz
[1] Qingshan Luo,et al. A scalable parallel Strassen's matrix multiplication algorithm for distributed-memory computers , 1995, SAC '95.
[2] Dario Bini. Relations between exact and approximate bilinear algorithms. Applications , 1980 .
[3] Frédéric Suter,et al. Impact of mixed‐parallelism on parallel implementations of the Strassen and Winograd matrix multiplication algorithms , 2004, Concurr. Pract. Exp..
[4] Robert L. Probert. On the Additive Complexity of Matrix Multiplication , 1976, SIAM J. Comput..
[5] Robert A. van de Geijn,et al. A High Performance Parallel Strassen Implementation , 1995, Parallel Process. Lett..
[6] Ramesh C. Agarwal,et al. A three-dimensional approach to parallel matrix multiplication , 1995, IBM J. Res. Dev..
[7] Nader H. Bshouty,et al. On the Additive Complexity of 2 x 2 Matrix Multiplication , 1995, Inf. Process. Lett..
[8] Jaeyoung Choi,et al. PUMMA: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers , 1994, Concurr. Pract. Exp..
[9] Telecommunications Board. The Future of Computing Performance: Game Over or Next Level? , 2011 .
[10] Leslie G. Valiant,et al. A bridging model for parallel computation , 1990, CACM.
[11] Bharat Kumar,et al. A tensor product formulation of Strassen's matrix multiplication algorithm with memory reduction , 1995 .
[12] James Demmel,et al. Graph expansion and communication costs of fast matrix multiplication , 2013 .
[13] Mei Han An,et al. Accuracy and Stability of Numerical Algorithms , 1991 .
[14] Jarle Berntsen,et al. Communication efficient matrix multiplication on hypercubes , 1989, Parallel Comput..
[15] Thomas Rauber,et al. Combining building blocks for parallel multi-level matrix multiplication , 2008, Parallel Comput..
[16] Alok Aggarwal,et al. Communication Complexity of PRAMs , 1990, Theor. Comput. Sci..
[17] Lynn Elliot Cannon,et al. A cellular computer to implement the Kalman filter algorithm , 1969 .
[18] Victor Y. Pan,et al. New Fast Algorithms for Matrix Operations , 1980, SIAM J. Comput..
[19] Christopher Umans. Group-theoretic algorithms for matrix multiplication , 2005, 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05).
[20] Jaeyoung Choi. A new parallel matrix multiplication algorithm on distributed-memory concurrent computers , 1998, Concurr. Pract. Exp..
[21] James Demmel,et al. Graph expansion and communication costs of fast matrix multiplication: regular submission , 2011, SPAA '11.
[22] Barton P. Miller,et al. Critical path analysis for the execution of parallel and distributed programs , 1988, [1988] Proceedings. The 8th International Conference on Distributed.
[23] Jack Dongarra,et al. ScaLAPACK Users' Guide , 1987 .
[24] Arnold Schönhage,et al. Partial and Total Matrix Multiplication , 1981, SIAM J. Comput..
[25] Jack Dongarra,et al. Experiments with Strassen's Algorithm: From Sequential to Parallel , 2006 .
[26] V. Strassen. Gaussian elimination is not optimal , 1969 .
[27] Telecommunications Board,et al. Getting Up to Speed: The Future of Supercomputing , 2005 .
[28] James Demmel,et al. Brief announcement: strong scaling of matrix multiplication algorithms and memory-independent communication lower bounds , 2012, SPAA '12.
[29] James Demmel,et al. Fast linear algebra is stable , 2006, Numerische Mathematik.
[30] Ran Raz,et al. On the complexity of matrix product , 2002, STOC '02.
[31] Don Coppersmith,et al. Matrix multiplication via arithmetic progressions , 1987, STOC.
[32] Robert A. van de Geijn,et al. SUMMA: scalable universal matrix multiplication algorithm , 1995, Concurr. Pract. Exp..
[33] James Demmel,et al. Minimizing Communication in Numerical Linear Algebra , 2009, SIAM J. Matrix Anal. Appl..
[34] Don Coppersmith,et al. On the Asymptotic Complexity of Matrix Multiplication , 1982, SIAM J. Comput..
[35] V. Strassen. Relative bilinear complexity and matrix multiplication , 1987 .
[36] James Demmel,et al. Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms , 2011, Euro-Par.
[37] Dror Irony,et al. Communication lower bounds for distributed-memory matrix multiplication , 2004, J. Parallel Distributed Comput..
[38] Shmuel Winograd,et al. On multiplication of 2 × 2 matrices , 1971 .
[39] Frédéric Suter,et al. Impact of mixed-parallelism on parallel implementations of the Strassen and Winograd matrix multiplication algorithms: Research Articles , 2004 .
[40] Francesco Romani,et al. Some Properties of Disjoint Sums of Tensors Related to Matrix Multiplication , 1982, SIAM J. Comput..
[41] James Demmel,et al. Fast matrix multiplication is stable , 2006, Numerische Mathematik.
[42] Samuel H. Fuller,et al. The Future of Computing Performance: Game Over or Next Level? , 2014 .