Performance Modeling and Optimal Block Size Selection for the Small-Bulge Multishift QR Algorithm
[1] Victor Eijkhout, et al. Self-adapting numerical software (SANS) effort, 2006, IBM J. Res. Dev.
[2] Jack J. Dongarra, et al. A Parallel Implementation of the Nonsymmetric QR Algorithm for Distributed Memory Architectures, 2002, SIAM J. Sci. Comput.
[3] Krister Dackland, et al. A Hierarchical Approach for Performance Analysis of ScaLAPACK-Based Routines Using the Distributed Linear Algebra Machine, 1996, PARA.
[4] David S. Watkins, et al. Shifting Strategies for the Parallel QR Algorithm, 1994, SIAM J. Sci. Comput.
[5] James Demmel, et al. On a Block Implementation of Hessenberg Multishift QR Iteration, 1989, Int. J. High Speed Comput.
[6] Daniel Kressner, et al. Numerical Methods for General and Structured Eigenvalue Problems, 2005, Lecture Notes in Computational Science and Engineering.
[7] Gene H. Golub, et al. Matrix computations (3rd ed.), 1996.
[8] V. Kublanovskaya. On some algorithms for the solution of the complete eigenvalue problem, 1962.
[9] Y. Kanada, et al. A Methodology for Automatically Tuned Parallel Tridiagonalization on Distributed Memory Vector-parallel Machines, 2000.
[10] David S. Watkins, et al. The transmission of shifts and shift blurring in the QR algorithm, 1996.
[11] J. G. F. Francis, et al. The QR Transformation - Part 2, 1962, Comput. J.
[12] David S. Watkins. Bidirectional chasing algorithms for the eigenvalue problem, 1993.
[13] J. G. F. Francis, et al. The QR Transformation A Unitary Analogue to the LR Transformation - Part 1, 1961, Comput. J.
[14] Victor Eijkhout, et al. Self-Adapting Numerical Software for Next Generation Applications, 2003, Int. J. High Perform. Comput. Appl.
[15] Yuefan Deng, et al. New trends in high performance computing, 2001, Parallel Computing.
[16] Karen S. Braman, et al. The Multishift QR Algorithm. Part I: Maintaining Well-Focused Shifts and Level 3 Performance, 2001, SIAM J. Matrix Anal. Appl.
[17] Javier Cuenca, et al. Architecture of an automatically tuned linear algebra library, 2004, Parallel Comput.
[18] Jack J. Dongarra, et al. Automated empirical optimizations of software and the ATLAS project, 2001, Parallel Comput.
[19] James Demmel, et al. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology, 1997, ICS '97.
[20] Javier Cuenca, et al. Empirical Modelling of Parallel Linear Algebra Routines, 2003, PPAM.
[21] Y. Yamamoto, et al. Performance modeling and optimal block size selection for a BLAS-3 based tridiagonalization algorithm, 2005, Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'05).
[22] James Demmel, et al. Applied Numerical Linear Algebra, 1997.