pOSKI: An Extensible Autotuning Framework to Perform Optimized SpMVs on Multicore Architectures
[1] Youcef Saad, et al. A Basic Tool Kit for Sparse Matrix Computations, 1990.
[2] James Demmel, et al. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology, 1997, ICS '97.
[3] Jack J. Dongarra, et al. Automatically Tuned Linear Algebra Software, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[4] Eun Im, et al. Optimizing the Performance of Sparse Matrix-Vector Multiplication, 2000.
[5] Dragan Mirkovic, et al. An adaptive software library for fast Fourier transforms, 2000, ICS '00.
[6] José M. F. Moura, et al. Fast Automatic Generation of DSP Algorithms, 2001, International Conference on Computational Science.
[7] James Demmel, et al. Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[8] Richard Vuduc, et al. Automatic performance tuning of sparse matrix kernels, 2003.
[9] Robert Love, et al. Linux Kernel Development, 2003.
[10] Yousef Saad, et al. Iterative methods for sparse linear systems, 2003.
[11] Katherine Yelick, et al. Performance Modeling and Analysis of Cache Blocking in Sparse Matrix Vector Multiply, 2004.
[12] Monica S. Lam, et al. RETROSPECTIVE: Software Pipelining: An Effective Scheduling Technique for VLIW Machines, 1998.
[13] Katherine Yelick, et al. OSKI: A library of automatically tuned sparse matrix kernels, 2005.
[14] Steven G. Johnson, et al. The Design and Implementation of FFTW3, 2005, Proceedings of the IEEE.
[15] Andrew Lumsdaine, et al. Accelerating sparse matrix computations via data compression, 2006, ICS '06.
[16] Samuel Williams, et al. The Landscape of Parallel Computing Research: A View from Berkeley, 2006.
[17] James Demmel, et al. When cache blocking of sparse matrix vector multiply works and why, 2007, Applicable Algebra in Engineering, Communication and Computing.
[18] Samuel Williams, et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms, 2009, Parallel Comput.