Low Thread-Count Gustavson: A Multithreaded Algorithm for Sparse Matrix-Matrix Multiplication Using Perfect Hashing
[1] L. Dagum, et al. OpenMP: an industry standard API for shared-memory programming, 1998.
[2] Jonathan J. Hu, et al. Design considerations for a flexible multigrid preconditioning library, 2012, Sci. Program.
[3] Fred G. Gustavson, et al. Two Fast Algorithms for Sparse Matrices: Multiplication and Permuted Transposition, 1978, TOMS.
[4] Daniel Sunderland, et al. Kokkos: Enabling manycore performance portability through polymorphic memory access patterns, 2014, J. Parallel Distributed Comput.
[5] Timothy A. Davis, et al. The University of Florida sparse matrix collection, 2011, TOMS.
[6] Satoshi Matsuoka, et al. High-Performance Sparse Matrix-Matrix Products on Intel KNL and Multicore Architectures, 2018, ICPP Workshops.
[7] Mark Frederick Hoemmen, et al. An Overview of Trilinos, 2003.
[8] Hans-Peter Seidel, et al. Adaptive sparse matrix-matrix multiplication on the GPU, 2019, PPoPP.
[9] Mehmet Deveci, et al. Performance-Portable Sparse Matrix-Matrix Multiplication for Many-Core Architectures, 2017, 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[10] Satoshi Matsuoka, et al. High-Performance and Memory-Saving Sparse General Matrix-Matrix Multiplication for NVIDIA Pascal GPU, 2017, 2017 46th International Conference on Parallel Processing (ICPP).
[11] Luke N. Olson, et al. Optimizing Sparse Matrix-Matrix Multiplication for the GPU, 2015, ACM Trans. Math. Softw.