Communication-Avoiding QR Decomposition for GPUs
James Demmel | Kurt Keutzer | Michael J. Anderson | Grey Ballard
[1] Yi Ma, et al. Robust principal component analysis?, 2009, JACM.
[2] D. Donoho, et al. Maximal Sparsity Representation via l1 Minimization, 2002.
[3] James Demmel, et al. Minimizing communication in sparse matrix solvers, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[4] J. Demmel, et al. Implementing Communication-Optimal Parallel and Sequential QR Factorizations, 2008, arXiv:0809.2407.
[5] Jack Dongarra, et al. Enhancing Parallelism of Tile QR Factorization for Multicore Architectures, 2010.
[6] James Demmel, et al. LAPACK Users' Guide, Third Edition, 1999, Software, Environments and Tools.
[7] Robert A. van de Geijn, et al. Solving dense linear systems on platforms with multiple hardware accelerators, 2009, PPoPP '09.
[8] Samuel Williams, et al. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[9] Guilherme N. DeSouza, et al. Adaptive learning of multi-subspace for foreground detection under illumination changes, 2011, Comput. Vis. Image Underst.
[10] Jack Dongarra, et al. LAPACK Users' guide (third ed.), 1999.
[11] Eric J. Kelmelis, et al. CULA: hybrid GPU accelerated linear algebra routines, 2010, Defense + Commercial Sensing.
[12] Rita Cucchiara, et al. ViSOR: VIdeo Surveillance On-line Repository for annotation retrieval, 2008, 2008 IEEE International Conference on Multimedia and Expo.
[13] James Demmel, et al. Benchmarking GPUs to tune dense linear algebra, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[14] Tamara G. Kolda, et al. An overview of the Trilinos project, 2005, TOMS.
[15] Mark A. Richards, et al. QR decomposition on GPUs, 2009, GPGPU-2.
[16] Xiaoming Yuan, et al. Sparse and low-rank matrix decomposition via alternating direction method, 2013.
[17] James Demmel, et al. LU, QR and Cholesky Factorizations using Vector Capabilities of GPUs, 2008.
[18] James Demmel, et al. Communication-optimal Parallel and Sequential QR and LU Factorizations, 2008, SIAM J. Sci. Comput.
[19] Mark Hoemmen, et al. Communication-avoiding Krylov subspace methods, 2010.
[20] Thomas Hérault, et al. QR factorization of tall and skinny matrices in a grid computing environment, 2009, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[21] James Demmel, et al. Applied Numerical Linear Algebra, 1997.
[22] Jack Dongarra, et al. Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, 2009.