Abstractions for Specifying Sparse Matrix Data Transformations
Mary W. Hall | Eddie C. Davis | M. Hall | M. Strout | M. Mohammadi | Payal Nandy | C. Olschanowsky | Wei He
[1] Joel H. Saltz, et al. Principles of runtime support for parallel processors, 1988, ICS '88.
[2] Joel H. Saltz, et al. The Preprocessed Doacross Loop, 1991, ICPP.
[3] Joel H. Saltz, et al. Runtime compilation techniques for data partitioning and communication schedule reuse, 1993, Supercomputing '93. Proceedings.
[4] Geoffrey C. Fox, et al. Supporting irregular distributions in FORTRAN 90D/HPF compilers, 1994.
[5] William Pugh, et al. Nonlinear array dependence analysis, 1994.
[6] Joel H. Saltz, et al. Programming Irregular Applications: Runtime Support, Compilation and Tools, 1997, Adv. Comput.
[7] Ken Kennedy, et al. Improving cache performance in dynamic applications through data and computation reorganization at run time, 1999, PLDI '99.
[8] Larry Carter, et al. Localizing non-affine array references, 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).
[9] L. Rauchwerger, et al. The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization, 1999, IEEE Trans. Parallel Distributed Syst.
[10] Ken Kennedy, et al. Improving Memory Hierarchy Performance for Irregular Applications Using Data and Computation Reorderings, 2001, International Journal of Parallel Programming.
[11] Chau-Wen Tseng, et al. Exploiting locality for irregular scientific codes, 2006, IEEE Transactions on Parallel and Distributed Systems.
[12] Rudolf Eigenmann, et al. Optimizing irregular shared-memory applications for distributed-memory systems, 2006, PPoPP '06.
[13] Bo Wu, et al. Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU, 2013, PPoPP '13.
[14] Mary W. Hall, et al. Non-affine Extensions to Polyhedral Code Generation, 2014, CGO '14.
[15] Mary W. Hall, et al. Loop and data transformations for sparse matrix code, 2015, PLDI.
[16] J. Ramanujam, et al. Distributed memory code generation for mixed Irregular/Regular computations, 2015, PPoPP.
[17] Keshav Pingali, et al. Synchronization Trade-Offs in GPU Implementations of Graph Algorithms, 2016, 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[18] Larry Carter, et al. An approach for code generation in the Sparse Polyhedral Framework, 2016, Parallel Comput.
[19] Khalid Ahmad, et al. Optimizing LOBPCG: Sparse Matrix Loop and Data Transformations in Action, 2016, LCPC.
[20] Mary W. Hall, et al. Compiler Transformation to Generate Hybrid Sparse Computations, 2016, 2016 6th Workshop on Irregular Applications: Architecture and Algorithms (IA3).
[21] Hongbo Rong, et al. Automating Wavefront Parallelization for Sparse Matrix Computations, 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[22] Manuel Selva, et al. Full runtime polyhedral optimizing loop transformations with the generation, instantiation, and scheduling of code‐bones, 2017, Concurr. Comput. Pract. Exp.