SparseX: A Library for High-Performance Sparse Matrix-Vector Multiplication on Multicore Platforms
Nectarios Koziris | Vasileios Karakasis | Georgios I. Goumas | Kornilios Kourtis | Athena Elafrou | Theodoros Gkountouvas
[1] Maria Ganzha,et al. Utilizing Recursive Storage in Sparse Matrix-Vector Multiplication - Preliminary Considerations , 2010, CATA.
[2] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[3] Calvin J. Ribbens,et al. Pattern-based sparse matrix representation for memory-efficient SMVM kernels , 2009, ICS.
[4] M. Hestenes,et al. Methods of conjugate gradients for solving linear systems , 1952 .
[5] Victor Eijkhout,et al. An iterative solver benchmark , 2001, Sci. Program..
[6] Yousef Saad,et al. Iterative methods for sparse linear systems , 2003 .
[7] Gerhard Wellein,et al. A Unified Sparse Matrix Data Format for Efficient General Sparse Matrix-Vector Multiplication on Modern Processors with Wide SIMD Units , 2013, SIAM J. Sci. Comput..
[8] Katherine A. Yelick,et al. Optimizing Sparse Matrix Computations for Register Reuse in SPARSITY , 2001, International Conference on Computational Science.
[9] Vicente H. F. Batista,et al. Parallel structurally-symmetric sparse matrix-vector products on multi-core processors , 2010, ArXiv.
[10] Tamara G. Kolda,et al. An overview of the Trilinos project , 2005, TOMS.
[11] Nectarios Koziris,et al. Performance evaluation of the sparse matrix-vector multiplication on modern architectures , 2009, The Journal of Supercomputing.
[12] Ninghui Sun,et al. SMAT: an input adaptive auto-tuner for sparse matrix-vector multiplication , 2013, PLDI.
[13] Frederico Pratas,et al. Cache-aware Roofline model: Upgrading the loft , 2014, IEEE Computer Architecture Letters.
[14] David H. Bailey,et al. The NAS parallel benchmarks summary and preliminary results , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[15] Hyun Jin Moon,et al. Fast Sparse Matrix-Vector Multiplication by Exploiting Variable Block Structure , 2005, HPCC.
[16] Ankit Jain. pOSKI : An Extensible Autotuning Framework to Perform Optimized SpMVs on Multicore Architectures , 2008 .
[17] William Gropp,et al. Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries , 1997, SciTools.
[18] Wolfram Schenck,et al. Performance Evaluation of Scientific Applications on POWER8 , 2014, PMBS@SC.
[19] Sandia Report,et al. Improving Performance via Mini-applications , 2009 .
[20] Ramesh C. Agarwal,et al. A high performance algorithm using pre-processing for the sparse matrix-vector multiplication , 1992, Proceedings Supercomputing '92.
[21] Mark J. Harris. Mapping computational concepts to GPUs , 2005, SIGGRAPH Courses.
[22] Endong Wang,et al. Intel Math Kernel Library , 2014 .
[23] Nectarios Koziris,et al. CSX: an extended compression format for spmv on shared memory systems , 2011, PPoPP '11.
[24] Nectarios Koziris,et al. Optimizing sparse matrix-vector multiplication using index and value compression , 2008, CF '08.
[25] Charles L. Lawson,et al. Basic Linear Algebra Subprograms for Fortran Usage , 1979, TOMS.
[26] John L. Henning. SPEC CPU2006 benchmark descriptions , 2006, CARN.
[27] P. Sadayappan,et al. On improving the performance of sparse matrix-vector multiplication , 1997, Proceedings Fourth International Conference on High-Performance Computing.
[28] Nectarios Koziris,et al. An Extended Compression Format for the Optimization of Sparse Matrix-Vector Multiplication , 2013, IEEE Transactions on Parallel and Distributed Systems.
[29] M. Gutknecht. BLOCK KRYLOV SPACE METHODS FOR LINEAR SYSTEMS WITH MULTIPLE RIGHT-HAND SIDES : AN , 2005 .
[30] John K. Reid,et al. Some Design Features of a Sparse Matrix Code , 1979, TOMS.
[31] R. F. Boisvert,et al. The Matrix Market Exchange Formats: Initial Design , 1996 .
[32] Brian Vinter,et al. CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication , 2015, ICS.
[33] J. W. Walker,et al. Direct solutions of sparse network equations by optimally ordered triangular factorization , 1967 .
[34] Arturo González-Escribano,et al. Blending Extensibility and Performance in Dense and Sparse Parallel Data Management , 2014, IEEE Transactions on Parallel and Distributed Systems.
[35] E. Cuthill,et al. Reducing the bandwidth of sparse symmetric matrices , 1969, ACM '69.
[36] Andrew Lumsdaine,et al. Accelerating sparse matrix computations via data compression , 2006, ICS '06.
[37] Katherine Yelick,et al. OSKI: A library of automatically tuned sparse matrix kernels , 2005 .
[38] D. Sorensen. Numerical methods for large eigenvalue problems , 2002, Acta Numerica.
[39] Sandia Report,et al. Toward a New Metric for Ranking High Performance Computing Systems , 2013 .
[40] Samuel Williams,et al. Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures , 2008 .
[41] Nectarios Koziris,et al. Improving the Performance of the Symmetric Sparse Matrix-Vector Multiplication in Multicore , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[42] C.A. Beattie,et al. Inexact Solves in Krylov-based Model Reduction , 2006, Proceedings of the 45th IEEE Conference on Decision and Control.
[43] Gerhard Wellein,et al. LIKWID: Lightweight Performance Tools , 2011, CHPC.
[44] Adrian E. Raftery,et al. Weather Forecasting with Ensemble Methods , 2005, Science.
[45] Joseph L. Greathouse,et al. Efficient Sparse Matrix-Vector Multiplication on GPUs Using the CSR Storage Format , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[46] Timothy A. Davis,et al. The University of Florida sparse matrix collection , 2011, TOMS.
[47] Udo W. Pooch,et al. A Survey of Indexing Techniques for Sparse Matrices , 1973, CSUR.