A Sparse Direct Solver for Distributed Memory Xeon Phi-Accelerated Systems
Richard W. Vuduc | Xiaoye S. Li | Piyush Sao | Xing Liu
[1] Chenhan D. Yu,et al. A CPU-GPU hybrid approach for the unsymmetric multifrontal method , 2011, Parallel Comput..
[2] Pradeep Dubey,et al. Tera-scale 1D FFT with low-communication algorithm and Intel® Xeon Phi™ coprocessors , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[3] George Bosilca,et al. Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes , 2014, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops.
[4] Roger Grimes,et al. Multifrontal Computations on GPUs and Their Multi-core Hosts , 2010, VECPAR.
[5] John K. Reid,et al. The Multifrontal Solution of Indefinite Sparse Symmetric Linear Equations , 1983, TOMS.
[6] I. Duff,et al. Direct Methods for Sparse Matrices , 1987 .
[7] Anamitra R. Choudhury,et al. Multifrontal Factorization of Sparse SPD Matrices on GPUs , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[8] Xing Liu,et al. Efficient sparse matrix-vector multiplication on x86-based many-core processors , 2013, ICS '13.
[9] Wu-chun Feng,et al. The Green500 List , 2007 .
[10] Timothy A. Davis,et al. The University of Florida sparse matrix collection , 2011, TOMS.
[11] Pradeep Dubey,et al. Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[12] Murat Efe Guney,et al. On the limits of GPU acceleration , 2010 .
[13] Pradeep Dubey,et al. Design and Implementation of the Linpack Benchmark for Single and Multi-node Systems Based on Intel® Xeon Phi Coprocessor , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[14] Jack J. Dongarra,et al. A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators , 2010, VECPAR.
[15] Dinesh Manocha,et al. LU-GPU: Efficient Algorithms for Solving Dense Linear Systems on Graphics Hardware , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[16] Iain S. Duff,et al. Direct methods for sparse matrices , 1986 .
[17] Victor Eijkhout,et al. Scheduling a Parallel Sparse Direct Solver to Multiple GPUs , 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[18] Eitan Grinspun,et al. Sparse matrix solvers on the GPU: conjugate gradients and multigrid , 2003, SIGGRAPH Courses.
[19] Richard W. Vuduc,et al. A Distributed CPU-GPU Sparse Direct Solver , 2014, Euro-Par.
[20] Gene Poole,et al. Accelerating the ANSYS Direct Sparse Solver with GPUs , 2011 .
[21] James Demmel,et al. Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms , 2011, Euro-Par.
[22] James Demmel,et al. SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems , 2003, TOMS.