High-performance implementation of the level-3 BLAS
暂无分享,去创建一个
[1] Jack J. Dongarra,et al. A set of level 3 basic linear algebra subprograms , 1990, TOMS.
[2] Ed Anderson,et al. LAPACK Users' Guide , 1995 .
[3] Jack J. Dongarra,et al. Automatically Tuned Linear Algebra Software , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[4] Bo Kågström,et al. GEMM-based level 3 BLAS: high-performance model implementations and performance evaluation benchmark , 1998, TOMS.
[5] Jack Dongarra,et al. LAPACK Users' guide (third ed.) , 1999 .
[6] Robert A. van de Geijn,et al. FLAME: Formal Linear Algebra Methods Environment , 2001, TOMS.
[7] Erik Elmroth,et al. SIAM REVIEW c ○ 2004 Society for Industrial and Applied Mathematics Vol. 46, No. 1, pp. 3–45 Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software ∗ , 2022 .
[8] Robert A. van de Geijn,et al. Toward Scalable Matrix Multiply on Multithreaded Architectures , 2007, Euro-Par.
[9] Robert A. van de Geijn,et al. Anatomy of high-performance matrix multiplication , 2008, TOMS.