Exploiting compression opportunities to improve SpMxV performance on shared memory systems
[1] Roman Geus,et al. Towards a fast parallel sparse matrix-vector multiplication , 2000, PARCO.
[2] Stamatis Vassiliadis,et al. Parallel Computer Architecture , 2000, Euro-Par.
[3] Katherine Yelick,et al. Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply , 2004 .
[4] Francisco F. Rivera,et al. Improving the locality of the sparse matrix-vector product on shared memory multiprocessors , 2004, 12th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2004).
[5] A. Pinar,et al. Improving Performance of Sparse Matrix-Vector Multiplication , 1999, ACM/IEEE SC 1999 Conference (SC'99).
[6] Olivier Temam,et al. Characterizing the behavior of sparse algorithms on caches , 1992, Proceedings Supercomputing '92.
[7] David A. Patterson,et al. Computer Architecture, Fifth Edition: A Quantitative Approach , 2011 .
[8] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1969 .
[9] J. Dongarra,et al. Exploiting the Performance of 32 bit Floating Point Arithmetic in Obtaining 64 bit Accuracy (Revisiting Iterative Refinement for Linear Systems) , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[10] Timothy A. Davis,et al. The university of Florida sparse matrix collection , 2011, TOMS.
[11] Richard Barrett,et al. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods , 1994, Other Titles in Applied Mathematics.
[12] Katherine A. Yelick,et al. Optimizing Sparse Matrix Computations for Register Reuse in SPARSITY , 2001, International Conference on Computational Science.
[13] David E. Keyes,et al. Four Horizons for Enhancing the Performance of Parallel Simulations Based on Partial Differential Equations , 2000, Euro-Par.
[14] Katherine A. Yelick,et al. Optimizing Sparse Matrix Vector Multiplication on SMP , 1999, SIAM Conference on Parallel Processing for Scientific Computing.
[15] Hyun Jin Moon,et al. Fast Sparse Matrix-Vector Multiplication by Exploiting Variable Block Structure , 2005, HPCC.
[16] Samuel Williams,et al. The Landscape of Parallel Computing Research: A View from Berkeley , 2006 .
[17] Youcef Saad,et al. A Basic Tool Kit for Sparse Matrix Computations , 1990 .
[18] Calvin J. Ribbens,et al. Pattern-based sparse matrix representation for memory-efficient SMVM kernels , 2009, ICS.
[19] Nectarios Koziris,et al. Improving the Performance of Multithreaded Sparse Matrix-Vector Multiplication Using Index and Value Compression , 2008, 2008 37th International Conference on Parallel Processing.
[20] Martin Burtscher,et al. High Throughput Compression of Double-Precision Floating-Point Data , 2007, 2007 Data Compression Conference (DCC'07).
[21] Nectarios Koziris,et al. Understanding the Performance of Sparse Matrix-Vector Multiplication , 2008, 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008).
[22] Ümit V. Çatalyürek,et al. Decomposing Irregularly Sparse Matrices for Parallel Matrix-Vector Multiplication , 1996, IRREGULAR.
[23] W. K. Anderson,et al. Achieving High Sustained Performance in an Unstructured Mesh CFD Application , 1999, ACM/IEEE SC 1999 Conference (SC'99).
[24] Sivan Toledo,et al. Improving the memory-system performance of sparse-matrix vector multiplication , 1997, IBM J. Res. Dev..
[25] James Demmel,et al. Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply , 2004, International Conference on Parallel Processing (ICPP 2004).
[26] David Moloney,et al. Streaming Sparse Matrix Compression/Decompression , 2005, HiPEAC.
[27] Nectarios Koziris,et al. Optimizing sparse matrix-vector multiplication using index and value compression , 2008, CF '08.
[28] E. Im,et al. Optimizing Sparse Matrix Vector Multiplication on SMP , 1999, PPSC.
[29] P. Sadayappan,et al. On improving the performance of sparse matrix-vector multiplication , 1997, Proceedings Fourth International Conference on High-Performance Computing.
[30] Anoop Gupta,et al. Parallel computer architecture - a hardware / software approach , 1998 .
[31] Nectarios Koziris,et al. Performance evaluation of the sparse matrix-vector multiplication on modern architectures , 2009, The Journal of Supercomputing.
[32] Martin Hopkins,et al. Synergistic Processing in Cell's Multicore Architecture , 2006, IEEE Micro.
[33] Yousef Saad,et al. Iterative methods for sparse linear systems , 2003 .
[34] Samuel Williams,et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[35] Michael Lang,et al. A Performance Evaluation of the Nehalem Quad-Core Processor for Scientific Computing , 2008, Parallel Process. Lett..
[36] John M. Mellor-Crummey,et al. Optimizing Sparse Matrix–Vector Product Computations Using Unroll and Jam , 2004, Int. J. High Perform. Comput. Appl..
[37] D. Geer,et al. Chip makers turn to multicore processors , 2005, Computer.
[38] Andrew Lumsdaine,et al. Accelerating sparse matrix computations via data compression , 2006, ICS '06.
[39] Srihari Makineni,et al. Exploring the cache design space for large scale CMPs , 2005, CARN.
[40] James Demmel,et al. Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply , 2002, ACM/IEEE SC 2002 Conference (SC'02).