Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices
Pradeep Dubey | Yutong Lu | Alexander Heinecke | Karthikeyan Vaidyanathan | Mikhail Smelyanskiy | Xing Liu | Jongsoo Park | Md. Mostofa Ali Patwary | Dhiraj D. Kalamkar
[1] Alex Pothen,et al. New Multithreaded Ordering and Coloring Algorithms for Multicore Architectures , 2011, Euro-Par.
[2] Luke N. Olson,et al. Exposing Fine-Grained Parallelism in Algebraic Multigrid Methods , 2012, SIAM J. Sci. Comput..
[3] Arutyun Avetisyan,et al. Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures , 2010, HiPEAC.
[4] Rudolf Eigenmann,et al. Adaptive runtime tuning of parallel sparse matrix-vector multiplication on distributed memory systems , 2008, ICS '08.
[5] George Karypis,et al. Multi-threaded Graph Partitioning , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[6] Erik Hagersten,et al. Multigrid and Gauss-Seidel smoothers revisited: parallelization on chip multiprocessors , 2006, ICS '06.
[7] Eitan Grinspun,et al. Sparse matrix solvers on the GPU , 2003 .
[8] Joel H. Saltz,et al. Aggregation Methods for Solving Sparse Triangular Systems on Multiprocessors , 1990, SIAM J. Sci. Comput..
[9] Jack Dongarra,et al. Toward a New Metric for Ranking High Performance Computing Systems , 2013, Sandia Report.
[10] Victor Eijkhout,et al. An iterative solver benchmark , 2001, Sci. Program..
[11] Yousef Saad,et al. Iterative methods for sparse linear systems , 2003 .
[12] E. L. Poole,et al. Multicolor ICCG methods for vector computers , 1987 .
[13] Anoop Gupta,et al. Parallel ICCG on a hierarchical memory multiprocessor - Addressing the triangular solve bottleneck , 1990, Parallel Comput..
[14] Pradeep Dubey,et al. Sparsifying Synchronization for High-Performance Shared-Memory Sparse Triangular Solver , 2014, ISC.
[15] Alex Pothen,et al. ColPack: Software for graph coloring and related problems in scientific computing , 2013, TOMS.
[16] Yousef Saad,et al. GPU-accelerated preconditioned iterative linear solvers , 2013, The Journal of Supercomputing.
[17] Eitan Grinspun,et al. Sparse matrix solvers on the GPU: conjugate gradients and multigrid , 2003, SIGGRAPH Courses.
[18] Ruipeng Li,et al. GPU-accelerated preconditioned iterative linear solvers , 2013 .
[19] Maxim Naumov. Parallel Solution of Sparse Triangular Linear Systems in the Preconditioned Iterative Methods on the GPU , 2011 .
[20] Matthias S. Müller,et al. Memory Performance and Cache Coherency Effects on an Intel Nehalem Multiprocessor System , 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.
[21] Todd Gamblin,et al. Scaling Algebraic Multigrid Solvers: On the Road to Exascale , 2010, CHPC.
[22] Takeshi Iwashita,et al. Block Red-Black Ordering: A New Ordering Strategy for Parallelization of ICCG Method , 2004, International Journal of Parallel Programming.
[23] Alex Pothen,et al. What Color Is Your Jacobian? Graph Coloring for Computing Derivatives , 2005, SIAM Rev..
[24] Timothy A. Davis,et al. The University of Florida sparse matrix collection , 2011, TOMS.
[25] David H. Bailey,et al. The NAS Parallel Benchmarks , 1991, Int. J. High Perform. Comput. Appl..
[26] H. Elman,et al. Ordering techniques for the preconditioned conjugate gradient method on parallel computers , 1989 .
[27] Samuel Williams,et al. Optimization of geometric multigrid for emerging multi- and manycore processors , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[28] V. E. Henson,et al. BoomerAMG: a parallel algebraic multigrid solver and preconditioner , 2002 .
[29] Yousef Saad,et al. Solving Sparse Triangular Linear Systems on Parallel Computers , 1989, Int. J. High Speed Comput..
[30] Cevdet Aykanat,et al. Fast optimal load balancing algorithms for 1D partitioning , 2004, J. Parallel Distributed Comput..
[31] Hiroshi Nakashima,et al. Algebraic Block Multi-Color Ordering Method for Parallel Multi-Threaded Sparse Triangular Solver in ICCG Method , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[32] Xing Liu,et al. Efficient sparse matrix-vector multiplication on x86-based many-core processors , 2013, ICS '13.
[33] Jan-Philipp Weiss,et al. Enhanced Parallel ILU(p)-based Preconditioners for Multi-core CPUs and GPUs -- The Power(q)-pattern Method , 2011 .
[34] Erik G. Boman,et al. Factors Impacting Performance of Multithreaded Sparse Triangular Solve , 2010, VECPAR.