STOMP: Statistical Techniques for Optimizing and Modeling Performance of Blocked Sparse Matrix Vector Multiplication
[1] Henk A. van der Vorst, et al. Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems, 1992, SIAM J. Sci. Comput.
[2] William Gropp, et al. Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries, 1997, SciTools.
[3] Timothy A. Davis, et al. A column approximate minimum degree ordering algorithm, 2000, TOMS.
[4] P. Sadayappan, et al. An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs, 2014, ICS '14.
[5] Srinivasan Parthasarathy, et al. Automatic Selection of Sparse Matrix Representation on GPUs, 2015, ICS.
[6] Victor Eijkhout, et al. Performance Optimization and Modeling of Blocked Sparse Kernels, 2007, Int. J. High Perform. Comput. Appl.
[7] William Gropp, et al. PETSc Users Manual Revision 3.4, 2016.
[8] Richard W. Vuduc, et al. Sparsity: Optimization Framework for Sparse Matrix Kernels, 2004, Int. J. High Perform. Comput. Appl.
[9] Y. Saad, et al. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, 1986.
[10] Timothy A. Davis, et al. The University of Florida sparse matrix collection, 2011, TOMS.
[11] Brian Vinter, et al. CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication, 2015, ICS.
[12] Richard Vuduc, et al. Automatic performance tuning of sparse matrix kernels, 2003.
[13] Nectarios Koziris, et al. A Comparative Study of Blocking Storage Methods for Sparse Matrices on Multicore Architectures, 2009, 2009 International Conference on Computational Science and Engineering.
[14] Richard W. Vuduc, et al. Model-driven autotuning of sparse matrix-vector multiply on GPUs, 2010, PPoPP '10.
[15] Rajeev Motwani, et al. The PageRank Citation Ranking: Bringing Order to the Web, 1999, WWW 1999.
[16] A. Pinar, et al. Improving Performance of Sparse Matrix-Vector Multiplication, 1999, ACM/IEEE SC 1999 Conference (SC'99).
[17] Ping Guo, et al. A Performance Modeling and Optimization Analysis Tool for Sparse Matrix-Vector Multiplication on GPUs, 2014, IEEE Transactions on Parallel and Distributed Systems.
[18] A. H. Sherman, et al. Comparative Analysis of the Cuthill–McKee and the Reverse Cuthill–McKee Ordering Algorithms for Sparse Matrices, 1976.
[19] Nectarios Koziris, et al. Understanding the Performance of Sparse Matrix-Vector Multiplication, 2008, 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008).
[20] Matthew G. Knepley, et al. PETSc Users Manual: Revision 3.11, 2019.
[21] Laura Grigori, et al. A New Scheduling Algorithm for Parallel Sparse LU Factorization with Static Pivoting, 2002, ACM/IEEE SC 2002 Conference (SC'02).