LIFT: A functional data-parallel IR for high-performance GPU code generation
Michel Steuwer | Christophe Dubach | Toomas Remmelg
[1] Kunle Olukotun, et al. Locality-Aware Mapping of Nested Parallel Patterns on GPUs, 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[2] Manuel M. T. Chakravarty, et al. Accelerating Haskell array codes with multicore GPUs, 2011, DAMP '11.
[3] Kurt Keutzer, et al. Copperhead: compiling an embedded data parallel language, 2011, PPoPP '11.
[4] Trevor L. McDonell. Optimising purely functional GPU programs, 2013, ICFP.
[5] Murray Cole, et al. Algorithmic Skeletons: Structured Management of Parallel Computation, 1989.
[6] Kunle Olukotun, et al. A Heterogeneous Parallel Framework for Domain-Specific Languages, 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[7] Sam Lindley, et al. Generating performance portable code using rewrite rules: from high-level functional expressions to high-performance OpenCL code, 2015, ICFP.
[8] Kunle Olukotun, et al. Have abstraction and eat performance, too: Optimized heterogeneous computing with parallel patterns, 2016, 2016 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[9] David F. Bacon, et al. Compiling a high-level language for GPUs: (via language support for architectures and compilers), 2012, PLDI.
[10] Frédo Durand, et al. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, 2013, PLDI 2013.
[11] Patrick Maier, et al. Towards an Adaptive Skeleton Framework for Performance Portability, 2015.
[12] Kunle Olukotun, et al. Delite, 2014, ACM Trans. Embed. Comput. Syst.
[13] Martin Elsman, et al. Size slicing: a hybrid approach to size inference in futhark, 2014, FHPC '14.
[14] Sebastian Hack, et al. A graph-based higher-order intermediate representation, 2015, 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[15] Abhishek Udupa, et al. Software Pipelined Execution of Stream Programs on GPUs, 2009, 2009 International Symposium on Code Generation and Optimization.
[16] Sergei Gorlatch, et al. SkelCL - A Portable Skeleton Library for High-Level GPU Programming, 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[17] Thomas Fahringer, et al. INSPIRE: The insieme parallel intermediate representation, 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[18] Elnar Hajiyev, et al. PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming, 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[19] Nathan Bell, et al. Thrust: A Productivity-Oriented Library for CUDA, 2012.
[20] Frank Mueller, et al. Hidp: A hierarchical data parallel language, 2013, Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[21] Sean Lee, et al. NOVA: A Functional Language for Data Parallelism, 2014, ARRAY@PLDI.