Hardware-Oriented Implementation of Cache Oblivious Matrix Operations Based on Space-Filling Curves
Michael Bader | Alexander Heinecke | Stephan M. Günther | Robert Franz
[1] Michael Bader,et al. Cache Oblivious Matrix Operations Using Peano Curves , 2006, PARA.
[2] Iain S. Duff,et al. The Design and Use of Algorithms for Permuting Large Entries to the Diagonal of Sparse Matrices , 1999, SIAM J. Matrix Anal. Appl..
[3] Jack J. Dongarra,et al. Automated empirical optimizations of software and the ATLAS project , 2001, Parallel Comput..
[4] R. W. Hamming. State of the art in scientific computing , 1963, AFIPS '63 (Spring).
[5] Yuefan Deng,et al. New trends in high performance computing , 2001, Parallel Computing.
[6] Michael Bader,et al. A Cache Oblivious Algorithm for Matrix Multiplication Based on Peano's Space Filling Curve , 2005, PPAM.
[7] Michael Bader,et al. Cache oblivious matrix multiplication using an element ordering based on the Peano curve , 2006 .
[8] Fred G. Gustavson,et al. Recursion leads to automatic variable blocking for dense linear-algebra algorithms , 1997, IBM J. Res. Dev..
[9] Douglas Aberdeen,et al. Emmerald: a fast matrix–matrix multiply using Intel's SSE instructions , 2001, Concurr. Comput. Pract. Exp..
[10] Erik Elmroth,et al. Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software , 2004, SIAM Rev..