Avoiding Communication in Successive Band Reduction
[1] Xiaobai Sun,et al. Parallel tridiagonalization through two-step band reduction , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[2] James Demmel,et al. Communication efficient Gaussian elimination with partial pivoting using a shape morphing data layout , 2013, SPAA.
[3] Lukas Krämer,et al. Developing algorithms and software for the parallel solution of the symmetric eigenvalue problem , 2011, J. Comput. Sci..
[4] Jack J. Dongarra,et al. Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[5] Wilfried N. Gansterer,et al. Multi-sweep Algorithms for the Symmetric Eigenproblem , 1998, VECPAR.
[6] Sivasankaran Rajamanickam,et al. Efficient Algorithms for Sparse Singular Value Decomposition , 2009 .
[7] James Demmel,et al. Performance and Accuracy of LAPACK's Symmetric Tridiagonal Eigensolvers , 2008, SIAM J. Sci. Comput..
[8] H. Schwarz. Tridiagonalization of a symmetric band matrix , 1968 .
[9] F. V. Zee. Restructuring the QR Algorithm for Performance , 2011 .
[10] James Demmel,et al. Communication avoiding successive band reduction , 2012, PPoPP '12.
[11] Bruno Lang,et al. Parallel Reduction of Banded Matrices to Bidiagonal Form , 1996, Parallel Comput..
[12] Enrique S. Quintana-Ortí,et al. Reduction to Condensed Forms for Symmetric Eigenvalue Problems on Multi-core Architectures , 2009, PPAM.
[13] D. Sorensen,et al. Block reduction of matrices to condensed forms for eigenvalue computations , 1990 .
[14] C. Loan,et al. A Storage-Efficient $WY$ Representation for Products of Householder Transformations , 1989 .
[15] Bruno Lang. Efficient eigenvalue and singular value computations on shared memory machines , 1999, Parallel Comput..
[16] K. Murata,et al. A New Method for the Tridiagonalization of the Symmetric Band Matrix , 1975 .
[17] Piotr Luszczek,et al. An improved parallel singular value algorithm and its implementation for multicore hardware , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[18] Jack J. Dongarra,et al. Toward a High Performance Tile Divide and Conquer Algorithm for the Dense Symmetric Eigenvalue Problem , 2012, SIAM J. Sci. Comput..
[19] James Demmel,et al. Communication-optimal Parallel and Sequential QR and LU Factorizations , 2008, SIAM J. Sci. Comput..
[20] Lukas Krämer,et al. Parallel solution of partial symmetric eigenvalue problems from electronic structure calculations , 2011, Parallel Comput..
[21] J. H. Wilkinson. Calculation of the eigenvalues of a symmetric tridiagonal matrix by the method of bisection , 1962 .
[22] Bruno Lang,et al. Efficient parallel reduction to bidiagonal form , 1999, Parallel Comput..
[23] Jack J. Dongarra,et al. High-performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures , 2013, TOMS.
[24] Phillip B. Gibbons. ACM Transactions on Parallel Computing , 2014 .
[25] Karen S. Braman,et al. The Multishift QR Algorithm. Part I: Maintaining Well-Focused Shifts and Level 3 Performance , 2001, SIAM J. Matrix Anal. Appl..
[26] Bruno Lang,et al. A Parallel Algorithm for Reducing Symmetric Banded Matrices to Tridiagonal Form , 1993, SIAM J. Sci. Comput..
[27] Jack J. Dongarra,et al. A novel hybrid CPU–GPU generalized eigensolver for electronic structure calculations based on fine-grained memory aware tasks , 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[28] Thomas Auckenthaler,et al. Highly scalable eigensolvers for petaflop applications , 2012 .
[29] Jack J. Dongarra,et al. Two-Stage Tridiagonal Reduction for Dense Symmetric Matrices Using Tile Algorithms on Multicore Architectures , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[30] Christian H. Bischof,et al. Algorithm 807: The SBR Toolbox—software for successive band reduction , 2000, TOMS.
[31] B. Parlett,et al. Multiple representations to compute orthogonal eigenvectors of symmetric tridiagonal matrices , 2004 .
[32] Alok Aggarwal,et al. The input/output complexity of sorting and related problems , 1988, CACM.
[33] Jehoshua Bruck,et al. Efficient algorithms for all-to-all communications in multi-port message-passing systems , 1994, SPAA '94.
[34] H. T. Kung,et al. I/O complexity: The red-blue pebble game , 1981, STOC '81.
[35] Lynn Elliot Cannon,et al. A cellular computer to implement the Kalman filter algorithm , 1969 .
[36] James Demmel,et al. Reconstructing Householder Vectors from Tall-Skinny QR , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[37] Lars Karlsson,et al. Parallel two-stage reduction to Hessenberg form using dynamic scheduling on shared-memory architectures , 2011, Parallel Comput..
[38] Linda Kaufman,et al. Banded Eigenvalue Solvers on Vector Machines , 1984, TOMS.
[39] Linda Kaufman. Band reduction algorithms revisited , 2000, TOMS.
[40] James Demmel,et al. Cache efficient bidiagonalization using BLAS 2.5 operators , 2008, TOMS.
[41] Daniel Kressner,et al. A Novel Parallel QR Algorithm for Hybrid Distributed Memory HPC Systems , 2010, SIAM J. Sci. Comput..
[42] Christopher Smith,et al. A Parallel Algorithm for Householder Tridiagonalization , 1994 .
[43] James Demmel,et al. Minimizing Communication in Numerical Linear Algebra , 2009, SIAM J. Matrix Anal. Appl..
[44] James Hardy Wilkinson,et al. The QR and QL Algorithms for Symmetric Matrices , 1971 .
[45] James Hardy Wilkinson,et al. Householder's method for symmetric matrices , 1962 .
[46] James Demmel,et al. LAPACK Users' Guide, Third Edition , 1999, Software, Environments and Tools.
[47] J. Cuppen. A divide and conquer method for the symmetric tridiagonal eigenproblem , 1980 .
[48] Jack J. Dongarra,et al. A Comprehensive Study of Task Coalescing for Selecting Parallelism Granularity in a Two-Stage Bidiagonal Reduction , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[49] Jack J. Dongarra,et al. Tridiagonalization of a Symmetric Dense Matrix on a GPU Cluster , 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[50] H. Rutishauser. On Jacobi rotation patterns , 1963 .
[51] Jack Dongarra,et al. ScaLAPACK Users' Guide , 1997 .
[52] Samuel H. Fuller,et al. Computing Performance: Game Over or Next Level? , 2011, Computer.
[53] Robert A. van de Geijn,et al. Restructuring the Tridiagonal and Bidiagonal QR Algorithms for Performance , 2014, ACM Trans. Math. Softw..
[54] Jack J. Dongarra,et al. A novel hybrid CPU–GPU generalized eigensolver for electronic structure calculations based on fine-grained memory aware tasks , 2014, Int. J. High Perform. Comput. Appl..
[55] J. H. Wilkinson,et al. The QR and QL algorithms for symmetric matrices , 1968 .
[56] Christian H. Bischof,et al. A framework for symmetric band reduction , 2000, TOMS.
[57] Christian H. Bischof,et al. Parallel Bandreduction and Tridiagonalization , 1993, PPSC.
[58] Samuel H. Fuller,et al. The Future of Computing Performance: Game Over or Next Level? , 2014 .
[59] C. H. Bischof,et al. A framework for symmetric band reduction and tridiagonalization , 1994 .