Contention Bounds for Combinations of Computation Graphs and Network Topologies.
James Demmel | Sivan Toledo | Oded Schwartz | Grey Ballard | Benjamin Lipshitz | Andrew Gearhart
[1] H. Whitney,et al. An inequality related to the isoperimetric inequality , 1949 .
[2] V. Strassen. Gaussian elimination is not optimal , 1969 .
[3] Alok Aggarwal,et al. Communication Complexity of PRAMs , 1990, Theor. Comput. Sci..
[4] Charles E. Leiserson,et al. Randomized Routing on Fat-Trees , 1989, Adv. Comput. Res..
[5] James Demmel,et al. Exploiting Data Sparsity in Parallel Matrix Powers Computations , 2013, PPAM.
[6] Viktor K. Prasanna,et al. Optimizing graph algorithms for improved cache performance , 2002, IEEE Transactions on Parallel and Distributed Systems.
[7] James Demmel,et al. Communication lower bounds and optimal algorithms for programs that reference arrays - Part 1 , 2013, ArXiv.
[8] Alok Aggarwal,et al. The input/output complexity of sorting and related problems , 1988, CACM.
[9] James Demmel,et al. Tradeoffs between synchronization, communication, and computation in parallel linear algebra computations , 2014, SPAA.
[10] Michele Scquizzato,et al. Communication Lower Bounds for Distributed-Memory Computations , 2013, STACS.
[11] Arnold Schönhage,et al. Partial and Total Matrix Multiplication , 1981, SIAM J. Comput..
[12] Gianfranco Bilardi,et al. Deterministic on-line routing on area-universal networks , 1995, JACM.
[13] James Demmel,et al. Graph Expansion Analysis for Communication Costs of Fast Rectangular Matrix Multiplication , 2012, MedAlg.
[14] H. T. Kung,et al. I/O complexity: The red-blue pebble game , 1981, STOC '81.
[15] Grey Ballard,et al. Avoiding Communication in Dense Linear Algebra , 2013 .
[16] P. Heidelberger,et al. The IBM Blue Gene/Q Interconnection Fabric , 2012, IEEE Micro.
[17] Dror Irony,et al. Communication lower bounds for distributed-memory matrix multiplication , 2004, J. Parallel Distributed Comput..
[18] Virginia Vassilevska Williams,et al. Multiplying matrices faster than Coppersmith-Winograd , 2012, STOC '12.
[19] Béla Bollobás,et al. Edge-isoperimetric inequalities in the grid , 1991, Comb..
[20] V. Strassen. Relative bilinear complexity and matrix multiplication. , 1987 .
[21] Charles E. Leiserson,et al. Fat-trees: Universal networks for hardware-efficient supercomputing , 1985, IEEE Transactions on Computers.
[22] Jack Dongarra,et al. ScaLAPACK Users' Guide , 1987 .
[23] James Demmel,et al. Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms , 2011, Euro-Par.
[24] P. Sadayappan,et al. Communication-Efficient Matrix Multiplication on Hypercubes , 1996, Parallel Comput..
[25] James Demmel,et al. Communication optimal parallel multiplication of sparse random matrices , 2013, SPAA.
[26] J. Demmel,et al. Tradeoffs between synchronization, communication, and work in parallel linear algebra computations , 2014 .
[27] Philip Heidelberger,et al. Blue Gene/L torus interconnection network , 2005, IBM J. Res. Dev..
[28] James Demmel,et al. Brief announcement: strong scaling of matrix multiplication algorithms and memory-independent communication lower bounds , 2012, SPAA '12.
[29] Gianfranco Bilardi,et al. A Lower Bound Technique for Communication on BSP with Application to the FFT , 2012, Euro-Par.
[30] Michael T. Goodrich,et al. Communication-Efficient Parallel Sorting , 1999, SIAM J. Comput..
[31] Lynn Elliot Cannon,et al. A cellular computer to implement the Kalman filter algorithm , 1969 .
[32] Toshiyuki Shimizu,et al. Tofu: A 6D Mesh/Torus Interconnect for Exascale Computers , 2009, Computer.
[33] R. J. Joenk,et al. IBM journal of research and development: information for authors , 1978 .
[34] James Reinders,et al. Intel Xeon Phi Coprocessor High Performance Programming , 2013 .
[35] Alexander Tiskin,et al. Memory-Efficient Matrix Multiplication in the BSP Model , 1999, Algorithmica.
[36] Katherine A. Yelick,et al. A Communication-Optimal N-Body Algorithm for Direct Interactions , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[37] Robert A. van de Geijn,et al. Collective communication: theory, practice, and experience , 2007, Concurr. Comput. Pract. Exp..
[38] James Demmel,et al. Minimizing Communication in Numerical Linear Algebra , 2009, SIAM J. Matrix Anal. Appl..