Code Generation and Optimization of Distributed-Memory Dense Linear Algebra Kernels
[1] Jack J. Dongarra, et al. A set of level 3 basic linear algebra subprograms, 1990, TOMS.
[2] Robert A. van de Geijn, et al. Collective communication: theory, practice, and experience, 2007, Concurr. Comput. Pract. Exp.
[3] Robert A. van de Geijn, et al. Designing Linear Algebra Algorithms by Transformation: Mechanizing the Expert Developer, 2012, VECPAR.
[4] Jack Dongarra, et al. LAPACK Users' guide (third ed.), 1999.
[5] Mary Shaw, et al. Software architecture - perspectives on an emerging discipline, 1996.
[6] Robert A. van de Geijn, et al. SuperMatrix: a multithreaded runtime scheduling system for algorithms-by-blocks, 2008, PPoPP.
[7] Robert A. van de Geijn, et al. Mechanical derivation and systematic analysis of correct linear algebra algorithms, 2006.
[8] Bo Kågström, et al. GEMM-based level 3 BLAS: high-performance model implementations and performance evaluation benchmark, 1998, TOMS.
[9] Ed Anderson, et al. LAPACK Users' Guide, 1995.
[10] Robert A. van de Geijn, et al. Elemental: A New Framework for Distributed Memory Dense Matrix Computations, 2013, TOMS.
[11] Robert A. van de Geijn, et al. FLAME: Formal Linear Algebra Methods Environment, 2001, TOMS.