Optimization by runtime specialization for sparse matrix-vector multiplication
Samuel N. Kamin | María Jesús Garzarán | Danqing Xu | Buse Yilmaz | Barış Aktemur | Zhongbo Chen
[1] Frank Pfenning,et al. A modal analysis of staged computation, 1996, POPL '96.
[2] Matteo Frigo,et al. A fast Fourier transform compiler , 1999, PLDI '99.
[3] Jack J. Dongarra,et al. Automated empirical optimizations of software and the ATLAS project , 2001, Parallel Comput.
[4] James Demmel,et al. Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply , 2002, ACM/IEEE SC 2002 Conference (SC'02).
[5] Walid Taha,et al. Environment classifiers , 2003, POPL.
[6] Samuel N. Kamin,et al. Jumbo: run-time code generation for Java and its applications , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003.
[7] Dan Grossman,et al. Compiling for template-based run-time code generation , 2003, Journal of Functional Programming.
[8] Richard W. Vuduc,et al. Sparsity: Optimization Framework for Sparse Matrix Kernels , 2004, Int. J. High Perform. Comput. Appl.
[9] Franz Franchetti,et al. SPIRAL: Code Generation for DSP Transforms , 2005, Proceedings of the IEEE.
[10] Albert Cohen,et al. Towards a High-Productivity and High-Performance Marshaling Library for Compound Data , 2005.
[11] Eduardo F. D'Azevedo,et al. Vectorized Sparse Matrix Multiply for Compressed Row Storage Format , 2005, International Conference on Computational Science.
[12] Victor Eijkhout,et al. Self-Adapting Linear Algebra Algorithms and Software , 2005, Proceedings of the IEEE.
[13] David A. Padua,et al. Optimizing sorting with genetic algorithms , 2005, International Symposium on Code Generation and Optimization.
[14] Samuel N. Kamin,et al. Optimizing marshalling by run-time program generation , 2005, GPCE'05.
[15] David A. Padua,et al. In search of a program generator to implement generic transformations for high-performance computing , 2006, Sci. Comput. Program.
[16] Samuel Williams,et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[17] Ankit Jain. pOSKI: An Extensible Autotuning Framework to Perform Optimized SpMVs on Multicore Architectures , 2008.
[18] John R. Gilbert,et al. Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks , 2009, SPAA '09.
[19] Calvin J. Ribbens,et al. Pattern-based sparse matrix representation for memory-efficient SMVM kernels , 2009, ICS.
[20] Walid Taha,et al. Mint: Java multi-stage programming using weak separability , 2010, PLDI '10.
[21] Samuel Williams,et al. Reduced-Bandwidth Multithreaded Algorithms for Sparse Matrix-Vector Multiplication , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[22] Makoto Tatsuta,et al. Static analysis of multi-staged programs via unstaging translation , 2011, POPL '11.
[23] Jacques Carette,et al. Multi-stage programming with functors and monads: eliminating abstraction overhead from generic code , 2005, GPCE'05.
[24] Timothy A. Davis,et al. The University of Florida sparse matrix collection , 2011, TOMS.
[25] Nectarios Koziris,et al. CSX: an extended compression format for spmv on shared memory systems , 2011, PPoPP '11.
[26] Luke N. Olson,et al. Exposing Fine-Grained Parallelism in Algebraic Multigrid Methods , 2012, SIAM J. Sci. Comput.
[27] Chung-chieh Shan,et al. Shonan challenge for generative programming: short position paper , 2013, PEPM '13.
[28] Luke N. Olson,et al. Optimizing Sparse Matrix-Matrix Multiplication for the GPU , 2015, ACM Trans. Math. Softw.