Sparse Matrix-Vector Multiplication on GPU
[1] Chih-Jen Lin, et al. Trust Region Newton Method for Logistic Regression, 2008, J. Mach. Learn. Res.
[2] Jack Dongarra, et al. Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, 2009.
[3] Robert D. Falgout, et al. hypre: A Library of High Performance Preconditioners, 2002, International Conference on Computational Science.
[4] Christopher Ré, et al. Materialization optimizations for feature selection workloads, 2014, SIGMOD Conference.
[5] I. Reguly, et al. Efficient sparse matrix-vector multiplication on cache-based GPUs, 2012, 2012 Innovative Parallel Computing (InPar).
[6] B. Ribeiro, et al. GPUMLib: An Efficient Open-Source GPU Machine Learning Library, 2011.
[7] P. Sadayappan, et al. High-performance sparse matrix-vector multiplication on GPUs for structured grid computations, 2012, GPGPU-5.
[8] John Nickolls, et al. Scalable parallel programming with CUDA introduction, 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[9] Kurt Keutzer, et al. Fast support vector machine training and classification on graphics processors, 2008, ICML '08.
[10] P. Baldi, et al. Searching for exotic particles in high-energy physics with deep learning, 2014, Nature Communications.
[11] Jon Kleinberg, et al. Authoritative sources in a hyperlinked environment, 1999, SODA '98.
[12] Xing Liu, et al. Efficient sparse matrix-vector multiplication on x86-based many-core processors, 2013, ICS '13.
[13] Y. Saad, et al. Krylov Subspace Methods on Supercomputers, 1989.
[14] Christos Faloutsos, et al. Random walk with restart: fast solutions and applications, 2008, Knowledge and Information Systems.
[15] Tinkara Toš, et al. Graph Algorithms in the Language of Linear Algebra, 2012, Software, environments, tools.
[16] Noel Lopes, et al. GPUMLib: A new Library to combine Machine Learning algorithms with Graphics Processing Units, 2010, 2010 10th International Conference on Hybrid Intelligent Systems.
[17] Toby Sharp, et al. Implementing Decision Trees and Forests on a GPU, 2008, ECCV.
[18] Rajesh Bordawekar, et al. Optimizing Sparse Matrix-Vector Multiplication on GPUs, 2009.
[19] Clément Farabet, et al. Torch7: A Matlab-like Environment for Machine Learning, 2011, NIPS 2011.
[20] Srinivasan Parthasarathy, et al. Fast Sparse Matrix-Vector Multiplication on GPUs: Implications for Graph Mining, 2011, Proc. VLDB Endow.
[21] Shirish Tatikonda, et al. SystemML: Declarative machine learning on MapReduce, 2011, 2011 IEEE 27th International Conference on Data Engineering.
[22] Hai Jin, et al. Optimization of Sparse Matrix-Vector Multiplication with Variant CSR on GPUs, 2011, 2011 IEEE 17th International Conference on Parallel and Distributed Systems.
[23] Jonathan D. Hogg. A Fast Dense Triangular Solve in CUDA, 2013, SIAM J. Sci. Comput.
[24] Chia-Hua Ho, et al. Large-scale linear support vector regression, 2012, J. Mach. Learn. Res.
[25] Eurípides Montagne, et al. An Alternative Compressed Storage Format for Sparse Matrices, 2003, ISCIS.
[26] Samuel Williams, et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms, 2009, Parallel Comput.
[27] James Demmel, et al. Fast Reproducible Floating-Point Summation, 2013, 2013 IEEE 21st Symposium on Computer Arithmetic.
[28] Roy H. Campbell, et al. A Parallel Implementation of K-Means Clustering on GPUs, 2008, PDPTA.
[29] Shengen Yan, et al. yaSpMV: yet another SpMV framework on GPUs, 2014, PPoPP.
[30] John F. Canny, et al. Big data analytics with small footprint: squaring the cloud, 2013, KDD.
[31] Yao Zhang, et al. Scan primitives for GPU computing, 2007, GH '07.
[32] Michael Garland, et al. Efficient Sparse Matrix-Vector Multiplication on CUDA, 2008.
[33] Jie Cheng, et al. Programming Massively Parallel Processors. A Hands-on Approach, 2010, Scalable Comput. Pract. Exp.
[34] Rajat Raina, et al. Large-scale deep unsupervised learning using graphics processors, 2009, ICML '09.
[35] Richard Vuduc, et al. Automatic performance tuning of sparse matrix kernels, 2003.
[36] Shirish Tatikonda, et al. Hybrid Parallelization Strategies for Large-Scale Machine Learning in SystemML, 2014, Proc. VLDB Endow.
[37] Tao Wang, et al. Deep learning with COTS HPC systems, 2013, ICML.
[38] Olivier Chapelle, et al. Training a Support Vector Machine in the Primal, 2007, Neural Computation.
[39] Srinivasan Parthasarathy, et al. Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications, 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[40] Marco Rosa, et al. Layered label propagation: a multiresolution coordinate-free ordering for compressing social networks, 2010, WWW.
[41] Richard W. Vuduc, et al. Model-driven autotuning of sparse matrix-vector multiply on GPUs, 2010, PPoPP '10.
[42] Rajeev Motwani, et al. The PageRank Citation Ranking: Bringing Order to the Web, 1999, WWW 1999.
[43] John Canny, et al. BIDMach: Large-scale Learning with Zero Memory Allocation, 2013.
[44] Michael Garland, et al. Implementing sparse matrix-vector multiplication on throughput-oriented processors, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.