James Demmel | Oded Schwartz | Grey Ballard | Olga Holtz
[1] Isak Jonsson,et al. High Performance Cholesky Factorization via Blocking and Recursion That Uses Minimal Storage , 2000, PARA.
[2] Barton P. Miller,et al. Critical path analysis for the execution of parallel and distributed programs , 1988, [1988] Proceedings. The 8th International Conference on Distributed.
[3] Jack Dongarra,et al. ScaLAPACK Users' Guide , 1987 .
[4] David S. Wise. Ahnentafel Indexing into Morton-Ordered Arrays, or Matrix Locality for Free , 2000, Euro-Par.
[5] Charles E. Leiserson,et al. Cache-Oblivious Algorithms , 2003, CIAC.
[6] Dror Irony,et al. Communication lower bounds for distributed-memory matrix multiplication , 2004, J. Parallel Distributed Comput..
[7] Michael A. Bender,et al. Optimal Sparse Matrix Dense Vector Multiplication in the I/O-Model , 2007, SPAA '07.
[8] James Demmel,et al. Communication Avoiding Gaussian elimination , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[9] James Demmel,et al. Communication-optimal Parallel and Sequential QR and LU Factorizations , 2008, SIAM J. Sci. Comput..
[10] Natacha Béreux. Out-of-Core Implementations of Cholesky Factorization: Loop-Based versus Recursive Algorithms , 2008, SIAM J. Matrix Anal. Appl..
[11] Fred G. Gustavson,et al. Recursion leads to automatic variable blocking for dense linear-algebra algorithms , 1997, IBM J. Res. Dev..
[12] Vijaya Ramachandran,et al. Cache-oblivious dynamic programming , 2006, SODA '06.
[13] Sivan Toledo. Locality of Reference in LU Decomposition with Partial Pivoting , 1997, SIAM J. Matrix Anal. Appl..
[14] Fred G. Gustavson,et al. A recursive formulation of Cholesky factorization of a matrix in packed storage , 2001, TOMS.
[15] Oded Schwartz,et al. Communication-optimal parallel and sequential Cholesky decomposition: extended abstract , 2009, SPAA.
[16] Marc Snir,et al. Getting Up to Speed: The Future of Supercomputing , 2004 .
[17] P. Tvrdík,et al. Analytical model for analysis of cache behavior during Cholesky factorization and its variants , 2004, Workshops on Mobile and Wireless Networking/High Performance Scientific, Engineering Computing/Network Design and Architecture/Optical Networks Control and Management/Ad Hoc and Sensor Networks/Compil.
[18] J. Demmel,et al. Implementing Communication-Optimal Parallel and Sequential QR Factorizations , 2008, 0809.2407.
[19] James Demmel,et al. Communication avoiding Gaussian elimination , 2008, HiPC 2008.
[20] Alok Aggarwal,et al. The input/output complexity of sorting and related problems , 1988, CACM.
[21] Jack Dongarra,et al. LAPACK Users' Guide , 1992 .
[22] John E. Savage. Extending the Hong-Kung Model to Memory Hierarchies , 1995, COCOON.
[23] H. T. Kung,et al. I/O complexity: The red-blue pebble game , 1981, STOC '81.
[24] Keshav Pingali,et al. Automatic Generation of Block-Recursive Codes , 2000, Euro-Par.
[25] Dror Irony,et al. Communication-Efficient Parallel Dense LU Using a 3-Dimensional Approach , 2001, PPSC.
[26] Michael Bader,et al. Hardware-Oriented Implementation of Cache Oblivious Matrix Operations Based on Space-Filling Curves , 2007, PPAM.
[27] James Demmel,et al. Minimizing Communication in Linear Algebra , 2009, ArXiv.
[28] Erik Elmroth,et al. Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software , 2004, SIAM Review, Vol. 46, No. 1, pp. 3–45.
[29] Nicholas J. Higham,et al. INVERSE PROBLEMS NEWSLETTER , 1991 .
[30] James Demmel,et al. IEEE Standard for Floating-Point Arithmetic , 2008 .