Sesame: A User-Transparent Optimizing Framework for Many-Core Processors
[1] Haoqiang Jin,et al. Performance characteristics of the multi-zone NAS parallel benchmarks , 2006 .
[2] Jianbin Fang,et al. An Auto-tuning Solution to Data Streams Clustering in OpenCL , 2011, 2011 14th IEEE International Conference on Computational Science and Engineering.
[3] Henk Sips,et al. Source-to-Source Vectorization for OpenCL Kernels , 2012 .
[4] Jianbin Fang,et al. A Comprehensive Performance Comparison of CUDA and OpenCL , 2011, 2011 International Conference on Parallel Processing.
[5] Jianbin Fang,et al. Memory Access Patterns on Architectures with Local Memory: A Performance Database , 2012 .
[6] Haoqiang Jin,et al. Performance characteristics of the multi-zone NAS parallel benchmarks , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
[7] Jie Shen,et al. Performance Gaps between OpenMP and OpenCL for Multi-core CPUs , 2012, 2012 41st International Conference on Parallel Processing Workshops.
[8] Jie Shen,et al. Accelerating Cost Aggregation for Real-Time Stereo Matching , 2012, 2012 IEEE 18th International Conference on Parallel and Distributed Systems.
[9] Kunle Olukotun,et al. A domain-specific approach to heterogeneous parallelism , 2011, PPoPP '11.
[10] L. Dagum,et al. OpenMP: an industry standard API for shared-memory programming , 1998 .
[11] Henri E. Bal,et al. Towards an Effective Unified Programming Model for Many-Cores , 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[12] Alejandro Duran,et al. Productive Programming of GPU Clusters with OmpSs , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[13] Yao Zhang,et al. Parallel Computing Experiences with CUDA , 2008, IEEE Micro.
[14] Seyong Lee,et al. Early evaluation of directive-based GPU programming models for productive exascale computing , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[15] Jie Shen,et al. ELMO: A User-Friendly API to Enable Local Memory in OpenCL Kernels , 2013, 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.
[16] Kunle Olukotun,et al. Implementing Domain-Specific Languages for Heterogeneous Parallel Computing , 2011, IEEE Micro.
[17] J. Xu. OpenCL – The Open Standard for Parallel Programming of Heterogeneous Systems , 2009 .