Communication Optimization of Iterative Sparse Matrix-Vector Multiply on GPUs and FPGAs
Nachiket Kapre | George A. Constantinides | Abid Rafique
[1] James Demmel, et al. Minimizing communication in sparse matrix solvers, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[2] Larry Carter, et al. Sparse Tiling for Stationary Iterative Methods, 2004, Int. J. High Perform. Comput. Appl.
[3] Wim Vanderbauwhede, et al. High-Performance Computing Using FPGAs, 2013.
[4] Etienne de Klerk, et al. Exploiting special structure in semidefinite programming: A survey of theory and applications, 2010, Eur. J. Oper. Res.
[5] Sally A. McKee, et al. Hitting the memory wall: implications of the obvious, 1995, CARN.
[6] James Demmel, et al. LAPACK Users' Guide, Third Edition, 1999, Software, Environments and Tools.
[7] Samuel Williams, et al. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[8] Kurt Keutzer, et al. A Predictive Model for Solving Small Linear Algebra Problems in GPU Registers, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[9] Anthony T. Chronopoulos, et al. A class of Lanczos-like algorithms implemented on parallel computers, 1991, Parallel Comput.
[10] Stephen J. Wright, et al. Application of Interior-Point Methods to Model Predictive Control, 1998.
[11] emontmej, et al. High Performance Computing, 2003, Lecture Notes in Computer Science.
[12] Gene H. Golub, et al. Matrix computations (3rd ed.), 1996.
[13] A. Mercer. Numerical Solution of Ordinary and Partial Differential Equations, 1963.
[14] Ramesh Subramonian, et al. LogP: towards a realistic model of parallel computation, 1993, PPOPP '93.
[15] Sivan Toledo, et al. Quantitative performance modeling of scientific computations and creating locality in numerical algorithms, 1995.
[16] Hemalatha, et al. High-Performance Computing using GPUs, 2013.
[17] Granville Sewell, et al. The numerical solution of ordinary and partial differential equations, 2005.
[18] Y. Danieli. Guide, 2005.
[19] Marc Snir, et al. Getting Up to Speed: The Future of Supercomputing, 2004.
[20] James Demmel, et al. Avoiding communication in sparse matrix computations, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[21] Robert H. Halstead, et al. Matrix Computations, 2011, Encyclopedia of Parallel Computing.