A Case for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization With Partial Pivoting
Enrique S. Quintana-Ortí | Sandra Catalán | Rafael Rodríguez-Sánchez | José R. Herrero | Robert A. van de Geijn
[1] P. Strazdins. A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization, 1998.
[2] Jack J. Dongarra, et al. A set of level 3 basic linear algebra subprograms, 1990, TOMS.
[3] Jack J. Dongarra, et al. Automatically Tuned Linear Algebra Software, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[4] Charles L. Lawson, et al. Basic Linear Algebra Subprograms for Fortran Usage, 1979, TOMS.
[5] Jack J. Dongarra, et al. An extended set of FORTRAN basic linear algebra subprograms, 1988, TOMS.
[6] Robert A. van de Geijn, et al. Anatomy of high-performance matrix multiplication, 2008, TOMS.
[7] Robert A. van de Geijn, et al. FLAME: Formal Linear Algebra Methods Environment, 2001, TOMS.
[8] Robert H. Halstead, et al. Matrix Computations, 2011, Encyclopedia of Parallel Computing.
[9] Robert A. van de Geijn, et al. BLIS: A Framework for Rapidly Instantiating BLAS Functionality, 2015, ACM Trans. Math. Softw.
[10] Robert A. van de Geijn, et al. High-performance implementation of the level-3 BLAS, 2008, TOMS.
[11] Gene H. Golub, et al. Matrix Computations, 3rd Edition, 2007.
[12] Rafael Mayo, et al. Architecture-aware configuration and scheduling of matrix multiplication on asymmetric multicore processors, 2015, Cluster Computing.
[13] Tze Meng Low, et al. The BLIS Framework, 2016.
[14] Robert A. van de Geijn, et al. Anatomy of High-Performance Many-Threaded Matrix Multiplication, 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[15] Enrique S. Quintana-Ortí, et al. Static Versus Dynamic Task Scheduling of the LU Factorization on ARM big.LITTLE Architectures, 2017, 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[16] Jack Dongarra, et al. LAPACK Users' Guide, 3rd ed., 1999.