Scheduling Two-sided Transformations using Algorithms-by-Tiles on Multicore Architectures (LAPACK Working Note #214)
[1] John A. Gunnels,et al. Minimal Data Copy for Dense Linear Algebra Factorization , 2006, PARA.
[2] James Demmel,et al. ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance , 1995, PARA.
[3] Robert A. van de Geijn,et al. Updating an LU Factorization with Pivoting , 2008, TOMS.
[4] Jack J. Dongarra,et al. Implementing Linear Algebra Routines on Multi-core Processors with Pipelining and a Look Ahead , 2006, PARA.
[5] Jack J. Dongarra,et al. Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization , 2008, IEEE Transactions on Parallel and Distributed Systems.
[6] Jack Dongarra,et al. Parallel tiled QR factorization for multicore architectures , 2008 .
[7] G. G. Stokes. "J." , 1890, The New Yale Book of Quotations.
[8] Viktor K. Prasanna,et al. Tiling, Block Data Layout, and Memory Hierarchy Performance , 2003, IEEE Trans. Parallel Distributed Syst..
[9] Jack Dongarra,et al. LAPACK Users' Guide, 3rd ed. , 1999 .
[10] G. W. Stewart,et al. Matrix Algorithms: Volume 1, Basic Decompositions , 1998 .
[11] Aaas News,et al. Book Reviews , 1893, Buffalo Medical and Surgical Journal.
[12] Julien Langou,et al. Parallel tiled QR factorization for multicore architectures , 2007, Concurr. Comput. Pract. Exp..
[13] C. Loan,et al. A Storage-Efficient $WY$ Representation for Products of Householder Transformations , 1989 .
[14] Markus Hegland,et al. A Parallel Algorithm for the Reduction to Tridiagonal Form for Eigendecomposition , 1999, SIAM J. Sci. Comput..
[15] Matemática,et al. Society for Industrial and Applied Mathematics , 2010 .
[16] Jack Dongarra,et al. QR Factorization for the CELL Processor , 2008 .
[17] C. Danforth,et al. Estimating and Correcting Global Weather Model Error , 2007 .
[18] Erik Elmroth,et al. Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software , 2004, SIAM Review, Vol. 46, No. 1, pp. 3–45.
[19] Ramesh C. Agarwal,et al. Vector and parallel algorithms for Cholesky factorization on IBM 3090 , 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[20] Taher H. Haveliwala,et al. The Second Eigenvalue of the Google Matrix , 2003 .
[21] Erik Elmroth,et al. High-Performance Library Software for QR Factorization , 2000, PARA.
[22] Jesús Labarta,et al. CellSs: Making it easier to program the Cell Broadband Engine processor , 2007, IBM J. Res. Dev..
[23] Philipp Birken,et al. Numerical Linear Algebra , 2011, Encyclopedia of Parallel Computing.
[24] Erik Elmroth,et al. Applying recursion to serial and parallel QR factorization leads to better performance , 2000, IBM J. Res. Dev..
[25] Gene H. Golub,et al. Matrix computations (3rd ed.) , 1996 .
[26] Robert A. van de Geijn,et al. Scheduling of QR Factorization Algorithms on SMP and Multi-Core Architectures , 2008, 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008).
[27] Rosa M. Badia,et al. CellSs: a Programming Model for the Cell BE Architecture , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[28] Fred G. Gustavson,et al. New Generalized Matrix Data Structures Lead to a Variety of High-Performance Algorithms , 2000, The Architecture of Scientific Software.
[29] Robert A. van de Geijn,et al. Parallel out-of-core computation and updating of the QR factorization , 2005, TOMS.
[30] E. L. Yip,et al. FORTRAN subroutines for out-of-core solutions of large complex linear systems , 1979 .
[31] Jesse L. Barlow,et al. Block and Parallel Versions of One-Sided Bidiagonalization , 2007, SIAM J. Matrix Anal. Appl..
[32] Viktor K. Prasanna,et al. Analysis of memory hierarchy performance of block data layout , 2002, Proceedings International Conference on Parallel Processing.
[33] Julien Langou,et al. A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures , 2007, Parallel Comput..
[34] Jack Dongarra,et al. Multithreading for synchronization tolerance in matrix factorization , 2007 .
[35] Z. Drmač,et al. A new stable bidiagonal reduction algorithm , 2005 .
[36] Erik Elmroth,et al. New Serial and Parallel Recursive QR Factorization Algorithms for SMP Systems , 1998, PARA.
[37] S. P. Kumar,et al. Solving Linear Algebraic Equations on an MIMD Computer , 1983, JACM.
[38] Jack Dongarra,et al. Parallel Tiled QR Factorization for Multicore Architectures LAPACK Working Note # 190 , 2007 .