Assessing the performance portability of modern parallel programming models using TeaLeaf
Matt Martineau | Simon McIntosh-Smith | Wayne P. Gaudin
[1] Seyong Lee,et al. Early evaluation of directive-based GPU programming models for productive exascale computing , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[2] Matthias S. Müller,et al. OpenMP in the Era of Low Power Devices and Accelerators , 2013, Lecture Notes in Computer Science.
[3] Yao Zhang,et al. Improving Performance Portability in OpenCL Programs , 2013, ISC.
[4] Christian Terboven,et al. A Pattern-Based Comparison of OpenACC and OpenMP for Accelerator Computing , 2014, Euro-Par.
[5] Michael A. Heroux,et al. Improving Performance via Mini-applications , 2009, Sandia National Laboratories technical report.
[6] Samuel Williams,et al. The Landscape of Parallel Computing Research: A View from Berkeley , 2006 .
[7] Kevin Skadron,et al. A performance study of general-purpose applications on graphics processors using CUDA , 2008, J. Parallel Distributed Comput.
[8] Daniel Sunderland,et al. Kokkos: Enabling manycore performance portability through polymorphic memory access patterns , 2014, J. Parallel Distributed Comput.
[9] John E. Stone,et al. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems , 2010, Computing in Science & Engineering.
[10] Arthur W. Toga,et al. CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms , 2012, Comput. Methods Programs Biomed.
[11] Jun Kong,et al. Comparative Performance Analysis of Intel (R) Xeon Phi (TM), GPU, and CPU: A Case Study from Microscopy Image Analysis , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[12] Stephen A. Jarvis,et al. Towards Portable Performance for Explicit Hydrodynamics Codes , 2013 .
[13] D. A. Beckingsale,et al. TeaLeaf: A New Mini-Application for Many-Core Aware, Iterative Sparse Linear Solvers , 2015, IPDPS 2015.
[14] Karl Rupp,et al. Performance portability study of linear algebra kernels in OpenCL , 2014, IWOCL '14.
[15] Simon McIntosh-Smith,et al. The OPS Domain Specific Abstraction for Multi-block Structured Grid Computations , 2014, 2014 Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing.
[16] Jack J. Dongarra,et al. From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming , 2012, Parallel Comput.
[17] Richard D. Hornung,et al. The RAJA Portability Layer: Overview and Status , 2014 .
[18] Stephen A. Jarvis,et al. Accelerating Hydrocodes with OpenACC, OpenCL and CUDA , 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[19] Timothy G. Mattson,et al. OpenCL Programming Guide , 2011 .
[20] Bronis R. de Supinski,et al. Early Experiences with the OpenMP Accelerator Model , 2013, IWOMP.
[21] Simon McIntosh-Smith,et al. On the Performance Portability of Structured Grid Codes on Many-Core Computer Architectures , 2014, ISC.
[22] J. A. Herdman,et al. Performance Analysis of a High-Level Abstractions-Based Hydrocode on Future Computing Systems , 2014, PMBS@SC.
[23] Kevin O'Brien,et al. Performance analysis of OpenMP on a GPU using a CORAL proxy application , 2015, PMBS '15.
[24] John Shalf,et al. Exascale Computing Trends: Adjusting to the "New Normal" for Computer Architecture , 2013, Computing in Science & Engineering.
[25] David A. Padua,et al. Performance Portability with the Chapel Language , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[26] Wu-chun Feng,et al. CU2CL: A CUDA-to-OpenCL Translator for Multi- and Many-Core Architectures , 2011, 2011 IEEE 17th International Conference on Parallel and Distributed Systems.
[27] Alistair Hart. First Experiences Porting a Parallel Application to a Hybrid Supercomputer with OpenMP4.0 Device Constructs , 2015, IWOMP.
[28] Malcolm Atkinson,et al. 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC) , 2012 .