Analyzing program flow within a many-kernel OpenCL application
David R. Kaeli | Kim M. Hazelwood | Perhaad Mistry | Norman Rubin | Chris Gregg
[1] Allen D. Malony, et al. An experimental approach to performance measurement of heterogeneous parallel applications using CUDA, 2010, ICS '10.
[2] Serge J. Belongie, et al. SD-VBS: The San Diego Vision Benchmark Suite, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[3] Fei Su, et al. Face recognition using SURF features, 2009, International Symposium on Multispectral Image Processing and Pattern Recognition.
[4] Matthew A. Brown, et al. Automatic Panoramic Image Stitching using Invariant Features, 2007, International Journal of Computer Vision.
[5] Wen-mei W. Hwu, et al. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA, 2008, PPoPP.
[6] Pat Hanrahan, et al. Brook for GPUs: stream computing on graphics hardware, 2004, ACM Trans. Graph.
[7] Christopher J. Hughes, et al. Computer Vision on Multi-Core Processors: Articulated Body Tracking, 2007, 2007 IEEE International Conference on Multimedia and Expo.
[8] Zhen Fang, et al. Performance characterization and optimization of mobile augmented reality on handheld platforms, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[9] Shirley Moore, et al. Continuous Runtime Profiling of OpenMP Applications, 2007, PARCO.
[10] Hubert Nguyen, et al. GPU Gems 3, 2007.
[11] Christopher Hunt, et al. Notes on the OpenSURF Library, 2009.
[12] Mark J. Harris, et al. Parallel Prefix Sum (Scan) with CUDA, 2011.
[13] Chi Hay Tong, et al. ECE 1724 Project Speeded-Up Speeded-Up Robust Features, 2009.
[14] Budirijanto Purnomo, et al. ATI Stream Profiler: a tool to optimize an OpenCL kernel on ATI Radeon GPUs, 2010, SIGGRAPH '10.
[15] Martin C. Herbordt, et al. GPU acceleration of a production molecular docking code, 2009, GPGPU-2.
[16] Ray W. Grout, et al. Accelerating S3D: A GPGPU Case Study, 2009, Euro-Par Workshops.
[17] Amy Apon, et al. Accelerating Image Feature Comparisons using CUDA on Commodity Hardware, 2010, HiPC 2010.
[18] Lance M. Berc, et al. Continuous profiling: where have all the cycles gone?, 1997, ACM Trans. Comput. Syst.
[19] Nan Zhang, et al. Computing Optimised Parallel Speeded-Up Robust Features (P-SURF) on Multi-Core Processors, 2010, International Journal of Parallel Programming.
[20] Grigori Fursin, et al. Predictive Runtime Code Scheduling for Heterogeneous Architectures, 2008, HiPEAC.
[21] Jun Luo, et al. Person-Specific SIFT Features for Face Recognition, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.