Tiramisu: A Code Optimization Framework for High Performance Systems
Shoaib Kamil | Emanuele Del Sozzo | Riyadh Baghdadi | Saman Amarasinghe | Patricia Suriana | Malek Ben Romdhane | Jessica Ray
[1] Uday Bondhugula, et al. Effective automatic parallelization of stencil computations, 2007, PLDI '07.
[2] Frédéric Vivien, et al. A unified framework for schedule and storage optimization, 2001, PLDI '01.
[3] Uday Bondhugula. Compiling affine loop nests for distributed-memory parallel architectures, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[4] Tomofumi Yuki, et al. AlphaZ: A System for Design Space Exploration in the Polyhedral Model, 2012, LCPC.
[5] Chun Chen, et al. Loop Transformation Recipes for Code Generation and Auto-Tuning, 2009, LCPC.
[6] Albert Cohen, et al. The Polyhedral Model Is More Widely Applicable Than You Think, 2010, CC.
[7] Monica S. Lam, et al. Communication optimization and code generation for distributed memory machines, 1993, PLDI '93.
[8] Manish Gupta, et al. On privatization of variables for data-parallel execution, 1997, Proceedings 11th International Parallel Processing Symposium.
[9] Uday Bondhugula, et al. PolyMage: Automatic Optimization for Image Processing Pipelines, 2015, ASPLOS.
[10] Shoaib Kamil, et al. Distributed Halide, 2016, PPoPP.
[11] Albert Cohen, et al. Hybrid Hexagonal/Classical Tiling for GPUs, 2014, CGO '14.
[12] Paul Feautrier, et al. Dataflow analysis of array and scalar references, 1991, International Journal of Parallel Programming.
[13] Richard W. Vuduc, et al. POET: Parameterized Optimizations for Empirical Tuning, 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[14] Uday Bondhugula, et al. Loop transformations: convexity, pruning and optimization, 2011, POPL '11.
[15] Albert Cohen, et al. GRAPHITE Two Years After First Lessons Learned From Real-World Polyhedral Compilation, 2010.
[16] Elnar Hajiyev, et al. PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming, 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[17] Uday Bondhugula, et al. A practical automatic polyhedral parallelizer and locality optimizer, 2008, PLDI '08.
[18] Mary W. Hall, et al. CHiLL: A Framework for Composing High-Level Loop Transformations, 2007.
[19] Monica S. Lam, et al. Array-data flow analysis and its use in array privatization, 1993, POPL '93.
[20] Christian Lengauer, et al. Polly - Performing Polyhedral Optimizations on a Low-Level Intermediate Representation, 2012, Parallel Process. Lett.
[21] Robert A. van de Geijn, et al. Anatomy of high-performance matrix multiplication, 2008, TOMS.
[22] Paul Feautrier, et al. Polyhedron Model, 2011, Encyclopedia of Parallel Computing.
[23] Sanjay V. Rajopadhye, et al. Optimizing memory usage in the polyhedral model, 2000, TOPL.
[24] Alain Darte, et al. New Complexity Results on Array Contraction and Related Problems, 2005, J. VLSI Signal Process.
[25] Lawrence G. Roberts, et al. Machine Perception of Three-Dimensional Solids, 1963, Outstanding Dissertations in the Computer Sciences.
[26] Frédo Durand, et al. Decoupling algorithms from schedules for easy optimization of image processing pipelines, 2012, ACM Trans. Graph.
[27] Monica S. Lam, et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism, 1991, IEEE Trans. Parallel Distributed Syst.
[28] David Parello, et al. Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory Hierarchies, 2006, International Journal of Parallel Programming.
[29] Frédo Durand, et al. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, 2013, PLDI.
[30] Marco D. Santambrogio, et al. A Unified Backend for Targeting FPGAs from DSLs, 2018, 2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP).
[31] Cédric Bastoul. Code Generation in the Polyhedral Model Is Easier Than You Think, 2004, IEEE PACT.
[32] Albert Cohen, et al. Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions, 2018, ArXiv.
[33] Paul Feautrier, et al. Automatic Storage Management for Parallel Programs, 1998, Parallel Comput.
[34] David A. Padua, et al. Automatic Array Privatization, 1993, Compiler Optimizations for Scalable Parallel Systems Languages.
[35] Zhiyuan Li. Array privatization for parallel execution of loops, 1992, ICS.
[36] Sven Verdoolaege, et al. isl: An Integer Set Library for the Polyhedral Model, 2010, ICMS.
[37] Wei Huang, et al. Design of High Performance MVAPICH2: MPI2 over InfiniBand, 2006, Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06).
[38] Monica S. Lam, et al. Data Dependence and Data-Flow Analysis of Arrays, 1992, LCPC.