Sparse matrix partitioning for optimizing SpMV on CPU-GPU heterogeneous platforms
Weixing Ji | Feng Shi | Akrem Benatia | Yizhuo Wang
[1] Michael Garland, et al. Nitro: A Framework for Adaptive Code Variant Tuning, 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[2] Jeffrey S. Vetter, et al. A Survey of Methods for Analyzing and Improving GPU Energy Efficiency, 2014, ACM Comput. Surv.
[3] Feng Shi, et al. Machine Learning Approach for the Predicting Performance of SpMV on GPU, 2016, 2016 IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS).
[4] Michael Garland, et al. Efficient Sparse Matrix-Vector Multiplication on CUDA, 2008.
[5] Richard W. Vuduc, et al. Sparsity: Optimization Framework for Sparse Matrix Kernels, 2004, Int. J. High Perform. Comput. Appl.
[6] Oscar H. Ibarra, et al. Heuristic Algorithms for Scheduling Independent Tasks on Nonidentical Processors, 1977, JACM.
[7] Eric S. Chung, et al. SpMV: A Memory-Bound Application on the GPU Stuck Between a Rock and a Hard Place, 2012.
[8] Srinivasan Parthasarathy, et al. Automatic Selection of Sparse Matrix Representation on GPUs, 2015, ICS.
[9] Davide Barbieri, et al. Sparse Matrix-Vector Multiplication on GPGPUs, 2017, ACM Trans. Math. Softw.
[10] Pavel Tvrdík, et al. Evaluation Criteria for Sparse Matrix Storage Formats, 2016, IEEE Transactions on Parallel and Distributed Systems.
[11] He Huang, et al. A model-driven partitioning and auto-tuning integrated framework for sparse matrix-vector multiplication on GPUs, 2011.
[12] Ping Guo, et al. A Performance Modeling and Optimization Analysis Tool for Sparse Matrix-Vector Multiplication on GPUs, 2014, IEEE Transactions on Parallel and Distributed Systems.
[13] Bernhard Schölkopf, et al. A tutorial on support vector regression, 2004, Stat. Comput.
[14] Wee Siong Lim. Optimizing sparse matrix kernels on coprocessors, 2014.
[15] Laxmikant V. Kale, et al. Accelerator Support in the Charm++ Parallel Programming Model, 2010.
[16] Timothy A. Davis, et al. The University of Florida sparse matrix collection, 2011, TOMS.
[17] Shengen Yan, et al. yaSpMV: yet another SpMV framework on GPUs, 2014, PPoPP.
[18] Kenli Li, et al. Performance Optimization Using Partitioned SpMV on GPUs and Multicore CPUs, 2015, IEEE Transactions on Computers.
[19] Jie Shen, et al. Workload Partitioning for Accelerating Applications on Heterogeneous Platforms, 2016, IEEE Transactions on Parallel and Distributed Systems.
[20] K. Srinathan, et al. A performance prediction model for the CUDA GPGPU platform, 2009, 2009 International Conference on High Performance Computing (HiPC).
[21] Samuel Williams, et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms, 2009, Parallel Comput.
[22] Kurt Hornik, et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.
[23] Satoshi Matsuoka, et al. Cache-aware sparse matrix formats for Kepler GPU, 2014, 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS).
[24] Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.
[25] Jack Dongarra, et al. Scientific Computing with Multicore and Accelerators, 2010, Chapman and Hall / CRC computational science series.
[26] Hyesoon Kim, et al. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness, 2009, ISCA '09.
[27] Arutyun Avetisyan, et al. Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures, 2010, HiPEAC.
[28] Ladislau Bölöni, et al. A Comparison of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems, 2001, J. Parallel Distributed Comput.
[29] Kenli Li, et al. A hybrid computing method of SpMV on CPU-GPU heterogeneous computing systems, 2017, J. Parallel Distributed Comput.
[30] Alexander J. Smola, et al. Support Vector Regression Machines, 1996, NIPS.
[31] Bertil Schmidt, et al. CUDA-enabled Sparse Matrix-Vector Multiplication on GPUs using atomic operations, 2013, Parallel Comput.
[32] Kurt Keutzer, et al. clSpMV: A Cross-Platform OpenCL SpMV Framework on GPUs, 2012, ICS '12.
[33] Richard W. Vuduc, et al. Model-driven autotuning of sparse matrix-vector multiply on GPUs, 2010, PPoPP '10.