Towards High-Performance Code Generation for Multi-GPU Clusters Based on a Domain-Specific Language for Algorithmic Skeletons
[1] Sam Lindley,et al. Generating performance portable code using rewrite rules: from high-level functional expressions to high-performance OpenCL code , 2015, ICFP.
[2] Herbert Kuchen,et al. Musket: a domain-specific language for high-level parallel programming with algorithmic skeletons , 2019, SAC.
[3] Herbert Kuchen,et al. Optimizing Sequences of Skeleton Calls , 2003, Domain-Specific Program Generation.
[4] Herbert Kuchen,et al. Algorithmic skeletons for multi-core, multi-GPU systems and clusters , 2012, Int. J. High Perform. Comput. Netw..
[5] Herbert Kuchen,et al. Generation of high-performance code based on a domain-specific language for algorithmic skeletons , 2019, The Journal of Supercomputing.
[6] Volker Gruhn,et al. Model-Driven Software Development , 2005 .
[7] Herbert Kuchen,et al. A Skeleton Library , 2002, Euro-Par.
[8] Christoph W. Kessler,et al. SkePU 2: Flexible and Type-Safe Skeleton Programming for Heterogeneous Parallel Systems , 2018, International Journal of Parallel Programming.
[9] Murray Cole,et al. Algorithmic Skeletons: Structured Management of Parallel Computation , 1989 .
[10] Marco Danelutto,et al. Parallel Patterns for General Purpose Many-Core , 2013, 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.
[11] Sam Lindley,et al. Generating performance portable code using rewrite rules: from high-level functional expressions to high-performance OpenCL code , 2015 .
[12] M Mernik,et al. When and how to develop domain-specific languages , 2005, CSUR.
[13] Hannes Schwarz,et al. Model-Driven Software Development , 2013 .
[14] Kevin Skadron,et al. Scalable parallel programming , 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[16] Marco Danelutto,et al. FastFlow: High-level and Efficient Streaming on Multi-core , 2017 .
[17] Kiminori Matsuzaki,et al. Implementing Fusion-Equipped Parallel Skeletons by Expression Templates , 2009, IFL.
[18] Nathan Bell,et al. Thrust: A Productivity-Oriented Library for CUDA , 2012 .
[19] Herbert Kuchen,et al. Data Parallel Algorithmic Skeletons with Accelerator Support , 2017, International Journal of Parallel Programming.
[20] Peter Kilpatrick,et al. Targeting Distributed Systems in FastFlow , 2012, Euro-Par Workshops.
[21] The Eclipse Foundation , 2017 .
[22] Marco Danelutto,et al. SPar: A DSL for High-Level and Productive Stream Parallelism , 2017, Parallel Process. Lett..