Tiramisu: A Polyhedral Compiler with A Scheduling Language for Targeting High Performance Systems
Shoaib Kamil | Emanuele Del Sozzo | Riyadh Baghdadi | Yunming Zhang | Saman Amarasinghe | Patricia Suriana | Malek Ben Romdhane | Jessica Ray | Abdurrahman Akkas
[1] Monica S. Lam, et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism, 1991, IEEE Trans. Parallel Distributed Syst.
[2] Monica S. Lam, et al. Communication optimization and code generation for distributed memory machines, 1993, PLDI '93.
[3] Alain Darte, et al. New Complexity Results on Array Contraction and Related Problems, 2005, J. VLSI Signal Process.
[4] Paul Feautrier, et al. Automatic Storage Management for Parallel Programs, 1998, Parallel Comput.
[5] Albert Cohen, et al. GRAPHITE Two Years After: First Lessons Learned From Real-World Polyhedral Compilation, 2010.
[6] Elnar Hajiyev, et al. PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming, 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[7] Marco D. Santambrogio, et al. A Unified Backend for Targeting FPGAs from DSLs, 2018, 2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP).
[8] Christian Lengauer, et al. Polly - Performing Polyhedral Optimizations on a Low-Level Intermediate Representation, 2012, Parallel Process. Lett.
[9] Uday Bondhugula, et al. Effective automatic parallelization of stencil computations, 2007, PLDI '07.
[10] Manish Gupta, et al. On privatization of variables for data-parallel execution, 1997, Proceedings 11th International Parallel Processing Symposium.
[11] Frédo Durand, et al. Decoupling algorithms from schedules for easy optimization of image processing pipelines, 2012, ACM Trans. Graph.
[12] Richard W. Vuduc, et al. POET: Parameterized Optimizations for Empirical Tuning, 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[13] Uday Bondhugula, et al. A practical automatic polyhedral parallelizer and locality optimizer, 2008, PLDI '08.
[14] Wei Huang, et al. Design of High Performance MVAPICH2: MPI2 over InfiniBand, 2006, Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06).
[15] Cédric Bastoul. Code Generation in the Polyhedral Model Is Easier Than You Think, 2004, IEEE PACT.
[16] Sanjay V. Rajopadhye, et al. Optimizing memory usage in the polyhedral model, 2000, TOPL.
[17] Chun Chen, et al. Loop Transformation Recipes for Code Generation and Auto-Tuning, 2009, LCPC.
[18] Frédo Durand, et al. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, 2013, PLDI.
[19] Paul Feautrier, et al. Polyhedron Model, 2011, Encyclopedia of Parallel Computing.
[20] Frédéric Vivien, et al. A unified framework for schedule and storage optimization, 2001, PLDI '01.
[21] Shoaib Kamil, et al. Distributed Halide, 2016, PPoPP.
[22] David Parello, et al. Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory Hierarchies, 2006, International Journal of Parallel Programming.
[23] Tomofumi Yuki, et al. AlphaZ: A System for Design Space Exploration in the Polyhedral Model, 2012, LCPC.
[24] Uday Bondhugula, et al. Loop transformations: convexity, pruning and optimization, 2011, POPL '11.
[25] Albert Cohen, et al. Hybrid Hexagonal/Classical Tiling for GPUs, 2014, CGO '14.
[26] Paul Feautrier, et al. Dataflow analysis of array and scalar references, 1991, International Journal of Parallel Programming.
[27] Monica S. Lam, et al. Array-data flow analysis and its use in array privatization, 1993, POPL '93.
[28] Uday Bondhugula. Compiling affine loop nests for distributed-memory parallel architectures, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[29] Uday Bondhugula, et al. PolyMage: Automatic Optimization for Image Processing Pipelines, 2015, ASPLOS.
[30] Albert Cohen, et al. Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions, 2018, ArXiv.
[31] Robert A. van de Geijn, et al. Anatomy of high-performance matrix multiplication, 2008, TOMS.
[32] Lawrence G. Roberts, et al. Machine Perception of Three-Dimensional Solids, 1963, Outstanding Dissertations in the Computer Sciences.
[33] David A. Padua, et al. Automatic Array Privatization, 1993, Compiler Optimizations for Scalable Parallel Systems Languages.
[34] Albert Cohen, et al. The Polyhedral Model Is More Widely Applicable Than You Think, 2010, CC.
[35] Sven Verdoolaege, et al. isl: An Integer Set Library for the Polyhedral Model, 2010, ICMS.
[36] Mary W. Hall, et al. CHiLL: A Framework for Composing High-Level Loop Transformations, 2007.
[37] Monica S. Lam, et al. Data Dependence and Data-Flow Analysis of Arrays, 1992, LCPC.
[38] Zhiyuan Li. Array privatization for parallel execution of loops, 1992, International Conference on Supercomputing.