Using Kernel Coupling to Improve the Performance of Multithreaded Applications

Kernel coupling refers to the effect that kernel i has on kernel j relative to running each kernel in isolation. Coupling can be measured for a pair of adjacent kernels or for a chain of three or more kernels in the control flow of an application. In previous work, we used kernel coupling to identify where further algorithm and implementation work was needed to improve performance, in particular the reuse of data between kernels. Coupling was also used to develop analytical models of applications as a composition of the models of the kernels that make up the application. In contrast, this paper extends the coupling concept to provide scaling information, in addition to coupling information, about multithreaded applications. We illustrate the use of the extended coupling parameter with two case studies, focused on the LU and Radix Sort benchmarks of the Splash-2 suite executed on the Cray MTA. The results indicate decreases of up to 55% in phantoms (or NOPs) and 14% in execution time.
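
As an illustration of the coupling idea, the following minimal sketch computes a coupling value for a chain of kernels. It assumes, as a formulation not stated in this abstract, that coupling is the ratio of the measured time of the kernels run together to the sum of their times measured in isolation, so that a value below 1 suggests a beneficial interaction (for example, data reuse) and a value above 1 suggests interference. The function names, the tolerance, and the timing values are hypothetical and are not measurements from the case studies.

```python
from typing import Sequence


def coupling(time_together: float, times_isolated: Sequence[float]) -> float:
    """Assumed coupling value: time of the kernel chain run together,
    divided by the sum of the times of each kernel run in isolation."""
    total_isolated = sum(times_isolated)
    if total_isolated <= 0:
        raise ValueError("isolated times must sum to a positive value")
    return time_together / total_isolated


def classify(c: float, tol: float = 0.05) -> str:
    """Label a coupling value; the tolerance band is an arbitrary choice."""
    if c < 1.0 - tol:
        return "constructive"   # chain runs faster than the isolated sum
    if c > 1.0 + tol:
        return "destructive"    # chain runs slower than the isolated sum
    return "neutral"


if __name__ == "__main__":
    # Hypothetical timings in seconds for two adjacent kernels.
    c = coupling(time_together=9.0, times_isolated=[5.0, 5.0])
    print(f"coupling = {c:.2f} ({classify(c)})")
```

The same function applies to a chain of three or more kernels by passing one isolated time per kernel in the chain.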