Perfect Strong Scaling Using No Additional Energy
James Demmel | Oded Schwartz | Benjamin Lipshitz | Andrew Gearhart