Bounds for Heterogeneous Architectures
[1] H. Whitney,et al. An inequality related to the isoperimetric inequality , 1949 .
[2] V. Strassen. Gaussian elimination is not optimal , 1969 .
[3] H. T. Kung,et al. I/O complexity: The red-blue pebble game , 1981, STOC '81.
[4] Bharat Kumar,et al. A tensor product formulation of Strassen's matrix multiplication algorithm with memory reduction , 1995 .
[5] Matteo Frigo,et al. DAG-consistent distributed shared memory , 1996, Proceedings of International Conference on Parallel Processing.
[6] David S. Wise. Ahnentafel Indexing into Morton-Ordered Arrays, or Matrix Locality for Free , 2000, Euro-Par.
[7] J. Demmel,et al. An updated set of basic linear algebra subprograms (BLAS) , 2002, TOMS.
[8] Erik Elmroth,et al. Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software , 2004, SIAM Review, Vol. 46, No. 1, pp. 3–45.
[9] Dror Irony,et al. Communication lower bounds for distributed-memory matrix multiplication , 2004, J. Parallel Distributed Comput..
[10] Michael Bader,et al. Hardware-Oriented Implementation of Cache Oblivious Matrix Operations Based on Space-Filling Curves , 2007, PPAM.
[11] James Demmel,et al. Communication avoiding Gaussian elimination , 2008, HiPC 2008.
[12] James Demmel,et al. Minimizing communication in sparse matrix solvers , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[13] James Demmel,et al. Communication-optimal Parallel and Sequential Cholesky Decomposition , 2009, SIAM J. Sci. Comput..
[14] James Demmel,et al. Brief announcement: communication bounds for heterogeneous architectures , 2011, SPAA '11.
[15] James Demmel,et al. Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms , 2011, Euro-Par.
[16] James Demmel,et al. Graph expansion and communication costs of fast matrix multiplication: regular submission , 2011, SPAA '11.
[17] James Demmel,et al. Minimizing Communication in Numerical Linear Algebra , 2009, SIAM J. Matrix Anal. Appl..
[18] James Demmel,et al. Communication-optimal Parallel and Sequential QR and LU Factorizations , 2008, SIAM J. Sci. Comput..
[19] James Demmel,et al. Brief announcement: strong scaling of matrix multiplication algorithms and memory-independent communication lower bounds , 2012, SPAA '12.
[20] James Demmel,et al. Communication-optimal parallel algorithm for Strassen's matrix multiplication , 2012, SPAA '12.