Sparse Tiling for Stationary Iterative Methods
Larry Carter | Jeanne Ferrante | Michelle Mills Strout | Barbara Kreaseck
[1] C. A. R. Hoare,et al. An axiomatic basis for computer programming , 1969, CACM.
[2] Dennis Gannon,et al. Strategies for cache and local memory management by global program transformation , 1988, J. Parallel Distributed Comput..
[3] Eun Im,et al. Optimizing the Performance of Sparse Matrix-Vector Multiplication , 2000 .
[4] George Karypis,et al. Multilevel k-way Partitioning Scheme for Irregular Graphs , 1998, J. Parallel Distributed Comput..
[5] William Pugh,et al. A unifying framework for iteration reordering transformations , 1995, Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.
[6] Chau-Wen Tseng,et al. Tiling Optimizations for 3D Scientific Computations , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[7] Larry Carter,et al. Rescheduling for Locality in Sparse Matrix Computations , 2001, International Conference on Computational Science.
[8] Zhiyuan Li,et al. New tiling techniques to improve cache temporal locality , 1999, PLDI '99.
[9] Keshav Pingali,et al. Synthesizing transformations for locality enhancement of imperfectly-nested loop nests , 2000 .
[10] Michael Wolfe,et al. Iteration Space Tiling for Memory Hierarchies , 1987, PPSC.
[11] Kang Su Gatlin,et al. Architecture-Cognizant Divide and Conquer Algorithms , 1999, ACM/IEEE SC 1999 Conference (SC'99).
[12] David S. Johnson,et al. Some Simplified NP-Complete Graph Problems , 1976, Theor. Comput. Sci..
[13] David G. Wonnacott,et al. Achieving Scalable Locality with Time Skewing , 2002, International Journal of Parallel Programming.
[14] Joel H. Saltz,et al. Communication Optimizations for Irregular Scientific Computations on Distributed Memory Architectures , 1994, J. Parallel Distributed Comput..
[15] Ulrich Rüde,et al. Cache Optimization for Structured and Unstructured Grid Multigrid , 2000 .
[16] Joel H. Saltz,et al. Principles of runtime support for parallel processors , 1988, ICS '88.
[17] Ken Kennedy,et al. Improving cache performance in dynamic applications through data and computation reorganization at run time , 1999, PLDI '99.
[18] William Pugh,et al. Finding Legal Reordering Transformations Using Mappings , 1994, LCPC.
[19] Larry Carter,et al. Combining Performance Aspects of Irregular Gauss-Seidel Via Sparse Tiling , 2002, LCPC.
[20] C. H. Wu. A multicolour SOR method for the finite-element method , 1990 .
[21] Ken Kennedy,et al. Compiler blockability of numerical algorithms , 1992, Proceedings Supercomputing '92.
[22] William Pugh,et al. Nonlinear array dependence analysis , 1994 .
[23] Chau-Wen Tseng,et al. Improving data locality with loop transformations , 1996, TOPL.
[24] Mark F. Adams. A distributed memory unstructured Gauss-Seidel algorithm for multigrid smoothers , 2001, SC.
[25] Michael Wolfe,et al. More iteration space tiling , 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[26] Larry Carter,et al. Performance transformations for irregular applications , 2003 .
[27] Robert J. Fowler,et al. Increasing Temporal Locality with Skewing and Recursive Blocking , 2001, ACM/IEEE SC 2001 Conference (SC'01).
[28] William Pugh,et al. SIPR: A New Framework for Generating Efficient Code for Sparse Matrix Computations , 1998, LCPC.
[29] Siddhartha Chatterjee,et al. Cache-Efficient Multigrid Algorithms , 2004, Int. J. High Perform. Comput. Appl..
[30] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[31] Kei Davis,et al. Optimizing Transformations of Stencil Operations for Parallel Object-Oriented Scientific Frameworks on Cache-Based Architectures , 1998, ISCOPE.
[32] Chau-Wen Tseng,et al. Efficient compiler and run-time support for parallel irregular reductions , 2000, Parallel Comput..
[33] François Irigoin,et al. Supernode partitioning , 1988, POPL '88.
[34] Larry Carter,et al. Localizing non-affine array references , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).
[35] William Pugh,et al. Iteration Space Slicing for Locality , 1999, LCPC.
[36] Jack J. Dongarra,et al. End-user Tools for Application Performance Analysis Using Hardware Counters , 2001, ISCA PDCS.
[37] Richard Barrett,et al. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods , 1994, Other Titles in Applied Mathematics.
[38] C. A. R. Hoare,et al. An axiomatic basis for computer programming , 1969, CACM.
[39] Katherine A. Yelick,et al. Optimizing Sparse Matrix Computations for Register Reuse in SPARSITY , 2001, International Conference on Computational Science.
[40] Ken Kennedy,et al. Improving Memory Hierarchy Performance for Irregular Applications Using Data and Computation Reorderings , 2001, International Journal of Parallel Programming.
[41] Keshav Pingali,et al. Data-centric multi-level blocking , 1997, PLDI '97.