OpenCL Optimization and Best Practices for Qualcomm Adreno GPUs

As the industry's leading mobile graphics processing unit (GPU) core, Adreno™ in Qualcomm® Snapdragon™ SoCs has supported the OpenCL™ standard since the A3x family, continuing through the A4x and A5x families to the latest A6x family. How to program and optimize OpenCL applications effectively on Adreno GPUs is of great interest to many OEMs as well as third-party app developers. This paper provides a high-level overview of Adreno's compute architecture, introduces Adreno's OpenCL support, offers general guidance and good practices for programming, optimization, and profiling, and illustrates through two use case studies how to apply them to achieve good performance.