Machine Learning-based Interference Detection in GPGPU Concurrent Kernel Execution

Recent advances in GPU architectures make it possible to run multiple kernels concurrently on a single GPU and thereby avoid under-utilization of its resources. Fine-grained sharing of streaming multiprocessors (SMs) allows thread blocks of multiple kernels to be assigned to the same GPU resources simultaneously. However, this can cause resource contention and performance degradation when co-running kernels try to access a shared resource at the same time. Detecting such interference is especially important in high-performance computing (HPC) systems, where multiple applications may issue different kernels to the available shared GPUs. This paper proposes a machine learning-based approach that characterizes kernels and predicts interference before they are executed concurrently. A random forest classifier is used to classify kernels as interfering or non-interfering. Experimental results show that the proposed method detects interfering kernels with up to 91.7% accuracy.
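The classification step can be illustrated with a minimal sketch of a random forest (bootstrap sampling, random feature subsets, majority voting) written in plain Python. It is not the authors' implementation. The three features per kernel pair (summed memory-bandwidth utilization, summed L2 miss rate, summed issue-slot utilization), the toy values, and all function names are assumptions made for illustration; the abstract does not state which metrics are used.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, rng, max_depth=3, n_sub=2):
    """Grow one tree; each node considers a random subset of n_sub features."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority  # leaf: an integer class label
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_sub))
    if split is None:
        return majority
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       rng, max_depth - 1, n_sub),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       rng, max_depth - 1, n_sub))

def predict_tree(node, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Hypothetical toy data, one row per kernel pair:
# [summed memory-bandwidth util., summed L2 miss rate, summed issue-slot util.]
pairs = [
    [0.40, 0.20, 0.50], [0.55, 0.30, 0.60], [0.35, 0.25, 0.45], [0.60, 0.35, 0.70],
    [1.30, 0.90, 1.40], [1.10, 0.80, 1.20], [1.45, 1.00, 1.50], [1.20, 0.85, 1.35],
]
interferes = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = interfering, 0 = non-interfering

forest = train_forest(pairs, interferes)
light_pair = predict_forest(forest, [0.30, 0.15, 0.40])
heavy_pair = predict_forest(forest, [1.50, 1.10, 1.60])
print("light pair:", light_pair, "heavy pair:", heavy_pair)
```

In practice the feature vectors would come from profiling each kernel in isolation, so that a prediction is available before the kernels are co-scheduled; the toy values above are cleanly separable only to keep the sketch short.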
