Sparsifying Synchronization for High-Performance Shared-Memory Sparse Triangular Solver
Pradeep Dubey | Mikhail Smelyanskiy | Jongsoo Park | Narayanan Sundaram
[1] Jack Dongarra,et al. Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects , 2009 .
[2] João Correia Lopes,et al. High Performance Computing for Computational Science - VECPAR 2010 - 9th International conference, Berkeley, CA, USA, June 22-25, 2010, Revised Selected Papers , 2011, VECPAR.
[3] E. L. Poole,et al. Multicolor ICCG methods for vector computers , 1987 .
[4] Anoop Gupta,et al. Parallel ICCG on a hierarchical memory multiprocessor - Addressing the triangular solve bottleneck , 1990, Parallel Comput..
[5] Hong Zhang,et al. Sparse triangular solves for ILU revisited: data layout crucial to better performance , 2011, Int. J. High Perform. Comput. Appl..
[6] J. Meijerink,et al. An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix , 1977 .
[7] V. E. Henson,et al. BoomerAMG: a parallel algebraic multigrid solver and preconditioner , 2002 .
[8] Ronald L. Graham,et al. Bounds on Multiprocessing Timing Anomalies , 1969, SIAM Journal on Applied Mathematics.
[9] Debra Hensgen,et al. Two algorithms for barrier synchronization , 1988, International Journal of Parallel Programming.
[10] Pradeep Dubey,et al. Fast and Efficient Graph Traversal Algorithm for CPUs: Maximizing Single-Node Efficiency , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[11] Tinkara Toš,et al. Graph Algorithms in the Language of Linear Algebra , 2012, Software, environments, tools.
[12] Robert A. van de Geijn,et al. Supermatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures , 2007, SPAA '07.
[13] Yousef Saad,et al. Solving Sparse Triangular Linear Systems on Parallel Computers , 1989, Int. J. High Speed Comput..
[14] Hiroshi Nakashima,et al. Algebraic Block Multi-Color Ordering Method for Parallel Multi-Threaded Sparse Triangular Solver in ICCG Method , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[15] M. Hestenes,et al. Methods of conjugate gradients for solving linear systems , 1952 .
[16] Victor Eijkhout,et al. A Parallel Sparse Direct Solver via Hierarchical DAG Scheduling , 2014, ACM Trans. Math. Softw..
[17] Joel H. Saltz,et al. Run-time parallelization and scheduling of loops , 1989, SPAA '89.
[18] William J. Dally,et al. Buffer-space efficient and deadlock-free scheduling of stream applications on multi-core architectures , 2010, SPAA '10.
[19] Sandia Report,et al. Toward a New Metric for Ranking High Performance Computing Systems , 2013 .
[20] Harry T. Hsu,et al. An Algorithm for Finding a Minimal Equivalent Graph of a Digraph , 1975, JACM.
[21] Joel H. Saltz,et al. Aggregation Methods for Solving Sparse Triangular Systems on Multiprocessors , 1990, SIAM J. Sci. Comput..
[22] Timothy A. Davis,et al. The University of Florida sparse matrix collection , 2011, TOMS.
[23] Jan Mayer,et al. Parallel algorithms for solving linear systems with sparse triangular matrices , 2009, Computing.
[24] Santa Clara,et al. Parallel Solution of Sparse Triangular Linear Systems in the Preconditioned Iterative Methods on the GPU , 2011 .
[25] Matthias S. Müller,et al. Memory Performance and Cache Coherency Effects on an Intel Nehalem Multiprocessor System , 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.
[26] T. C. Hu. Parallel Sequencing and Assembly Line Problems , 1961 .
[27] Erik G. Boman,et al. Factors Impacting Performance of Multithreaded Sparse Triangular Solve , 2010, VECPAR.