Paper information - Kepler-architecture-based CUDA (Compute Unified Device Architecture) runtime parameter transparent optimization method


An embodiment of the invention provides a Kepler-architecture-based CUDA (Compute Unified Device Architecture) runtime parameter transparent optimization method, in the technical field of CUDA programming. The method saves the time otherwise needed to find a performance-optimized runtime parameter configuration for a kernel function. In the method, a background service unpacks the packaged call request sent by an intercepting end and obtains the runtime parameter information of the kernel function. From this information the background service calculates the total number of threads the kernel function requires and determines the thread-count class to which that total belongs. It then adjusts the thread block size according to the determined class and calculates the adjusted number of thread blocks and the adjusted shared memory size. Finally, the adjusted runtime parameters and the execution part of the kernel function are passed, through the back-end server, to the CUDA runtime layer of the background service for execution.
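The adjustment step described above can be illustrated with a minimal sketch. The abstract does not give the class boundaries, the block size chosen for each class, or the rule for recomputing shared memory, so everything specific below is an assumption made for illustration: the `SIZE_CLASSES` table, the `DEFAULT_BLOCK` value, the one-dimensional launch, the linear scaling of shared memory with block size, and the names `LaunchConfig` and `adjust`. The only hardware facts used are the Kepler per-block limits (1024 threads and 48 KB of shared memory).

```python
from dataclasses import dataclass

# Kepler (compute capability 3.x) per-block limits.
MAX_THREADS_PER_BLOCK = 1024
MAX_SHARED_BYTES_PER_BLOCK = 48 * 1024

# Hypothetical class table: (upper bound on total threads, block size to use).
# The real boundaries and block sizes are not given in the abstract.
SIZE_CLASSES = [(4096, 64), (65536, 128), (1 << 20, 256)]
DEFAULT_BLOCK = 512  # hypothetical block size for the largest class


@dataclass(frozen=True)
class LaunchConfig:
    grid: int          # number of thread blocks (1-D launch assumed)
    block: int         # threads per block
    shared_bytes: int  # dynamic shared memory per block


def adjust(cfg: LaunchConfig) -> LaunchConfig:
    """Return an adjusted launch configuration, or cfg itself if none fits."""
    if cfg.grid <= 0 or not 0 < cfg.block <= MAX_THREADS_PER_BLOCK:
        raise ValueError("invalid launch configuration")

    # Step 1: total number of threads the kernel asked for.
    total = cfg.grid * cfg.block

    # Step 2: find the thread-count class and its block size.
    new_block = DEFAULT_BLOCK
    for upper, block in SIZE_CLASSES:
        if total <= upper:
            new_block = block
            break

    # Step 3: recompute the block count (ceiling division).
    new_grid = -(-total // new_block)

    # Assumption: shared memory per block scales linearly with block size.
    new_shared = -(-cfg.shared_bytes * new_block // cfg.block)
    if new_shared > MAX_SHARED_BYTES_PER_BLOCK:
        return cfg  # keep the caller's parameters if the limit is exceeded

    return LaunchConfig(new_grid, new_block, new_shared)


if __name__ == "__main__":
    print(adjust(LaunchConfig(grid=100, block=32, shared_bytes=256)))
```

Because the block count is rounded up, the adjusted launch may start a few more threads than the original one, so a sketch like this is only safe for kernels that bounds-check their global index and do not depend on the original block shape.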

杨刚 | 张策 | 王严 | 杜三盛