Specializing Compiler Optimizations through Programmable Composition for Dense Matrix Computations
Qian Wang | Qing Yi | Huimin Cui
[1] Chun Chen,et al. ECO: an empirical-based compilation and optimization system , 2003, Proceedings International Parallel and Distributed Processing Symposium.
[2] Qian Wang,et al. AUGEM: Automatically generate high performance Dense Linear Algebra kernels on x86 CPUs , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[3] Keshav Pingali,et al. Synthesizing Transformations for Locality Enhancement of Imperfectly-Nested Loop Nests , 2001, International Journal of Parallel Programming.
[4] Jaewook Shin,et al. Superword-level parallelism in the presence of control flow , 2005, International Symposium on Code Generation and Optimization.
[5] Qing Yi,et al. POET: a scripting language for applying parameterized source‐to‐source program transformations , 2012, Softw. Pract. Exp.
[6] Chun Chen,et al. Combining models and guided empirical search to optimize for multiple levels of the memory hierarchy , 2005, International Symposium on Code Generation and Optimization.
[7] Rudolf Eigenmann,et al. Fast, automatic, procedure-level performance tuning , 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[8] R. C. Whaley,et al. Automated transformation for performance-critical kernels , 2007, LCSD '07.
[9] Jichi Guo,et al. Automated empirical tuning of scientific codes for performance and power consumption , 2011, HiPEAC.
[10] Ken Kennedy,et al. Improving the ratio of memory operations to floating-point operations in loops , 1994, TOPLAS.
[11] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[12] Yunheung Paek,et al. Finding effective optimization phase sequences , 2003, LCTES '03.
[13] Dongrui Fan,et al. Extendable pattern-oriented optimization directives , 2012, International Symposium on Code Generation and Optimization (CGO 2011).
[14] Monica S. Lam,et al. An affine partitioning algorithm to maximize parallelism and minimize communication , 1999, ICS '99.
[15] Cédric Bastoul,et al. Code generation in the polyhedral model is easier than you think , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004.
[16] Allen,et al. Optimizing Compilers for Modern Architectures , 2004.
[17] Ken Kennedy,et al. Scalar replacement in the presence of conditional control flow , 1994, Softw. Pract. Exp.
[18] Keith D. Cooper,et al. Engineering a Compiler , 2003.
[19] Richard Henderson,et al. Multi-platform auto-vectorization , 2006, International Symposium on Code Generation and Optimization (CGO'06).
[20] Uday Bondhugula,et al. A practical automatic polyhedral parallelizer and locality optimizer , 2008, PLDI '08.
[21] Ken Kennedy,et al. Optimizing Compilers for Modern Architectures: A Dependence-based Approach , 2001.
[22] Elizabeth R. Jessup,et al. Automating the generation of composed linear algebra kernels , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[23] Qing Yi,et al. Automated programmable control and parameterization of compiler optimizations , 2011, International Symposium on Code Generation and Optimization (CGO 2011).
[24] Peng Wu,et al. Vectorization for SIMD architectures with alignment constraints , 2004, PLDI '04.
[25] James Demmel,et al. LAPACK Users' Guide, Third Edition , 1999, Software, Environments and Tools.
[26] Keshav Pingali,et al. Data-centric multi-level blocking , 1997, PLDI '97.
[27] Monica S. Lam,et al. The cache performance and optimizations of blocked algorithms , 1991, ASPLOS IV.
[28] Jack J. Dongarra,et al. Automatically Tuned Linear Algebra Software , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[29] Chau-Wen Tseng,et al. Improving data locality with loop transformations , 1996, TOPLAS.
[30] David Parello,et al. Facilitating the search for compositions of program transformations , 2005, ICS '05.