Multi-role SpTRSV on Sunway Many-Core Architecture
[1] Li Kenli, et al. Implementing Molecular Dynamics Simulation on Sunway TaihuLight System, 2016.
[2] Yves Robert, et al. STS-k: a multilevel sparse triangular solution scheme for NUMA multicores, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[3] Chi Xue-bin, et al. Extreme-Scale Phase Field Simulations of Coarsening Dynamics on the Sunway TaihuLight Supercomputer, 2016.
[4] Mahmut T. Kandemir, et al. Optimizing Data Layouts for Parallel Computation on Multicores, 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[5] Guangwen Yang, et al. swDNN: A Library for Accelerating Deep Learning Applications on Sunway TaihuLight, 2017, 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[6] Timothy A. Davis, et al. The University of Florida sparse matrix collection, 2011, TOMS.
[7] Weiguo Liu, et al. 18.9-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of 18-Hz and 8-Meter Scenarios, 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.
[8] Wenguang Chen, et al. Scalable Graph Traversal on Sunway TaihuLight with Ten Million Cores, 2017, 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[9] Chao Yang, et al. 26 PFLOPS Stencil Computations for Atmospheric Modeling on Sunway TaihuLight, 2017, 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[10] Hongbo Rong, et al. Automating Wavefront Parallelization for Sparse Matrix Computations, 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[11] Pradeep Dubey, et al. Sparsifying Synchronization for High-Performance Shared-Memory Sparse Triangular Solver, 2014, ISC.
[12] Thomas R. Gross, et al. Synchronized-by-Default Concurrency for Shared-Memory Systems, 2017, PPOPP.
[13] Yousef Saad, et al. Solving Sparse Triangular Linear Systems on Parallel Computers, 1989, Int. J. High Speed Comput.
[14] Chau-Wen Tseng, et al. Exploiting locality for irregular scientific codes, 2006, IEEE Transactions on Parallel and Distributed Systems.
[15] Idit Keidar, et al. SALSA: scalable and low synchronization NUMA-aware algorithm for producer-consumer pools, 2012, SPAA '12.
[16] Brian Vinter, et al. A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves, 2016, Euro-Par.
[17] David A. Padua, et al. Hydra: Automatic algorithm exploration from linear algebra equations, 2013, Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[18] Jan Mayer, et al. Parallel algorithms for solving linear systems with sparse triangular matrices, 2009, Computing.
[19] Maurice Herlihy, et al. Concurrent Data Structures for Near-Memory Computing, 2017, SPAA.
[20] Joel H. Saltz, et al. Aggregation Methods for Solving Sparse Triangular Systems on Multiprocessors, 1990, SIAM J. Sci. Comput.
[21] Edmond Chow, et al. Iterative Sparse Triangular Solves for Preconditioning, 2015, Euro-Par.
[22] Mary W. Hall, et al. Loop and data transformations for sparse matrix code, 2015, PLDI.
[23] Erik G. Boman, et al. Factors Impacting Performance of Multithreaded Sparse Triangular Solve, 2010, VECPAR.
[24] Edmond Chow, et al. Fine-Grained Parallel Incomplete LU Factorization, 2015, SIAM J. Sci. Comput.