A High Performance QDWH-SVD Solver Using Hardware Accelerators
[1] Jack J. Dongarra,et al. Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[2] Robert H. Halstead,et al. Matrix Computations , 2011, Encyclopedia of Parallel Computing.
[3] Gene H. Golub,et al. Matrix computations (3rd ed.) , 1996 .
[4] Jack J. Dongarra,et al. Solving the Generalized Symmetric Eigenvalue Problem using Tile Algorithms on Multicore Architectures , 2011, PARCO.
[5] L. Trefethen,et al. Numerical linear algebra , 1997 .
[6] P. Hansen. Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion , 1987 .
[7] Jack J. Dongarra,et al. A novel hybrid CPU–GPU generalized eigensolver for electronic structure calculations based on fine-grained memory aware tasks , 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[8] B. Parlett,et al. Block reflectors: theory and computation , 1988 .
[9] Stanley C. Eisenstat,et al. A Divide-and-Conquer Algorithm for the Bidiagonal SVD , 1995, SIAM J. Matrix Anal. Appl.
[10] James Demmel,et al. Accurate Singular Values of Bidiagonal Matrices , 1990, SIAM J. Sci. Comput.
[11] Jack J. Dongarra,et al. Parallel Two-Sided Matrix Reduction to Band Bidiagonal Form on Multicore Architectures , 2010, IEEE Transactions on Parallel and Distributed Systems.
[12] Nicholas J. Higham,et al. Parallel Singular Value Decomposition via the Polar Decomposition , 2006 .
[13] P. Schönemann,et al. A generalized solution of the orthogonal procrustes problem , 1966 .
[14] James Demmel,et al. Minimizing Communication for Eigenproblems and the Singular Value Decomposition , 2010, ArXiv.
[15] Jerome A. Goldstein,et al. Linear algebra and quantum chemistry , 1991 .
[16] B. Parlett,et al. Accurate singular values and differential qd algorithms , 1994 .
[17] Jack Dongarra,et al. QUARK Users' Guide: QUeueing And Runtime for Kernels , 2011 .
[18] Nicholas J. Higham,et al. Stable and Efficient Spectral Divide and Conquer Algorithms for the Symmetric Eigenvalue Decomposition and the SVD , 2013, SIAM J. Sci. Comput.
[19] Piotr Luszczek,et al. An improved parallel singular value algorithm and its implementation for multicore hardware , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[20] Jack Dongarra,et al. Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects , 2009 .
[21] Wilfred Pinfold,et al. Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis , 2009, HiPC 2009.
[22] Gene H. Golub,et al. Singular value decomposition and least squares solutions , 1970, Milestones in Matrix Computation.
[23] W. Kahan,et al. Computing small singular values of bidiagonal matrices with guaranteed high relative accuracy: LAPACK working note number 3 , 1988 .
[24] Jack J. Dongarra,et al. A novel hybrid CPU–GPU generalized eigensolver for electronic structure calculations based on fine-grained memory aware tasks , 2014, Int. J. High Perform. Comput. Appl.
[25] Itzhack Bar-Itzhack,et al. Iterative Optimal Orthogonalization of the Strapdown Matrix , 1975, IEEE Transactions on Aerospace and Electronic Systems.
[26] John Shalf,et al. The International Exascale Software Project roadmap , 2011, Int. J. High Perform. Comput. Appl.
[27] Bruno Lang. Efficient eigenvalue and singular value computations on shared memory machines , 1999, Parallel Comput.
[28] K. S. Arun,et al. A Unitarily Constrained Total Least Squares Problem in Signal Processing , 1992, SIAM J. Matrix Anal. Appl.
[29] Zhaojun Bai,et al. Optimizing Halley's Iteration for Computing the Matrix Polar Decomposition , 2010, SIAM J. Matrix Anal. Appl.
[30] Jack J. Dongarra,et al. Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices Using Tile Algorithms on Multicore Architectures , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[31] Christian H. Bischof,et al. Algorithm 807: The SBR Toolbox—software for successive band reduction , 2000, TOMS.
[32] Jack Dongarra,et al. Parallel Band Two-Sided Matrix Bidiagonalization for Multicore Architectures , 2009 .
[33] Philipp Birken,et al. Numerical Linear Algebra , 2011, Encyclopedia of Parallel Computing.