Automated GPU Grid Geometry Selection for OpenMP Kernels
[1] Yi Yang, et al. Warp-level divergence in GPUs: Characterization, impact, and mitigation, 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[2] L. Dagum, et al. OpenMP: an industry standard API for shared-memory programming, 1998.
[3] Rudolf Eigenmann, et al. OpenMPC: Extended OpenMP Programming and Tuning for GPUs, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[4] Michael F. P. O'Boyle, et al. Portable mapping of data parallel programs to OpenCL for heterogeneous systems, 2013, Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[5] Bo Joel Svensson, et al. Meta-programming and auto-tuning in the search for high performance GPU code, 2015, FHPC@ICFP.
[6] Matthew E. Taylor, et al. Feature selection and policy optimization for distributed instruction placement using reinforcement learning, 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[7] Lifan Xu, et al. Auto-tuning a high-level language targeted to GPU codes, 2012, 2012 Innovative Parallel Computing (InPar).
[8] Michael F. P. O'Boyle, et al. Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping, 2009, PLDI '09.
[9] Michael F. P. O'Boyle, et al. Mapping parallelism to multi-cores: a machine learning based approach, 2009, PPoPP '09.
[11] Leo Breiman, et al. Random Forests, 2001, Machine Learning.
[12] Sven-Bodo Scholz, et al. Unibench: A Tool for Automated and Collaborative Benchmarking, 2010, 2010 IEEE 18th International Conference on Program Comprehension.
[13] Kevin O'Brien, et al. Integrating GPU support for OpenMP offloading directives into Clang, 2015, LLVM '15.