Performance Evaluation of Sparse Matrix Multiplication Kernels on Intel Xeon Phi
[1] James Demmel,et al. When cache blocking of sparse matrix vector multiply works and why, 2007, Applicable Algebra in Engineering, Communication and Computing.
[2] Ankit Jain. pOSKI: An Extensible Autotuning Framework to Perform Optimized SpMVs on Multicore Architectures, 2008.
[3] Zheng Zhou,et al. An Out-of-Core Eigensolver on SSD-equipped Clusters , 2012, 2012 IEEE International Conference on Cluster Computing.
[4] John R. Gilbert,et al. Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks , 2009, SPAA '09.
[5] Louis-Noël Pouchet,et al. Automatic Transformations for Effective Parallel Execution on Intel Many Integrated Core, 2012.
[6] Katherine A. Yelick,et al. Optimizing Sparse Matrix Computations for Register Reuse in SPARSITY , 2001, International Conference on Computational Science.
[7] D. Panda,et al. Intra-MIC MPI Communication using MVAPICH2: Early Experience, 2012.
[8] Youcef Saad,et al. A Basic Tool Kit for Sparse Matrix Computations, 1990.
[9] Samuel Williams,et al. Reduced-Bandwidth Multithreaded Algorithms for Sparse Matrix-Vector Multiplication , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[10] Ümit V. Çatalyürek,et al. Fast Recommendation on Bibliographic Networks , 2012, 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining.
[11] Ümit V. Çatalyürek,et al. An Early Evaluation of the Scalability of Graph Algorithms on the Intel MIC Architecture , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum.
[12] Michael Klemm,et al. OpenMP Programming on Intel Xeon Phi Coprocessors: An Early Performance Comparison , 2012, MARC@RWTH.
[13] Samuel Williams,et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[14] Marcin Dabrowski,et al. Parallel symmetric sparse matrix-vector product on scalar multi-core CPUs , 2010, Parallel Comput.
[15] E. Cuthill,et al. Reducing the bandwidth of sparse symmetric matrices , 1969, ACM '69.
[16] Michael Garland,et al. Implementing sparse matrix-vector multiplication on throughput-oriented processors , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[17] Katherine Yelick,et al. OSKI: A library of automatically tuned sparse matrix kernels, 2005.
[18] John M. Mellor-Crummey,et al. Optimizing Sparse Matrix–Vector Product Computations Using Unroll and Jam , 2004, Int. J. High Perform. Comput. Appl.