On Applying Performance Portability Metrics
[1] Samuel Williams, et al. Roofline: an insightful visual performance model for multicore architectures, 2009, CACM.
[2] Stephen A. Jarvis, et al. Achieving Performance Portability for a Heat Conduction Solver Mini-Application on Modern Multi-core Systems, 2017, 2017 IEEE International Conference on Cluster Computing (CLUSTER).
[3] Michael Frumkin, et al. Implementation of NAS Parallel Benchmarks in High Performance Fortran, 2000.
[4] Sunita Chandrasekaran, et al. SPEC ACCEL: A Standard Application Suite for Measuring Hardware Accelerator Performance, 2014, PMBS@SC.
[5] J. Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, 1994.
[6] Victor W. Lee, et al. A Metric for Performance Portability, 2016, ArXiv.
[7] Jason Sewall, et al. Effective Performance Portability, 2018, 2018 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC).
[8] Christoph W. Kessler, et al. Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: programming productivity, performance, and energy consumption, 2017, ARMS-CC@PODC.
[9] Yousef Saad, et al. Iterative methods for sparse linear systems, 2003.
[10] Ana Lucia Varbanescu, et al. A Beginner's Guide to Estimating and Improving Performance Portability, 2018, ISC Workshops.
[11] Victor W. Lee, et al. Implications of a metric for performance portability, 2017, Future Gener. Comput. Syst.
[12] David H. Bailey, et al. The NAS Parallel Benchmarks, 1991, Int. J. High Perform. Comput. Appl.
[13] Ulrich Rüde, et al. Optimization and Profiling of the Cache Performance of Parallel Lattice Boltzmann Codes in 2D and 3D, 2003.
[14] Wen-mei W. Hwu, et al. Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing, 2012.
[15] Daniel Sunderland, et al. Kokkos: Enabling manycore performance portability through polymorphic memory access patterns, 2014, J. Parallel Distributed Comput.
[16] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[17] Justin P. Haldar, et al. Accelerating advanced MRI reconstructions on GPUs, 2008, J. Parallel Distributed Comput.
[18] Jeffrey C. Carver, et al. Parallel Programmer Productivity: A Case Study of Novice Parallel Programmers, 2005, ACM/IEEE SC 2005 Conference (SC'05).