Inferring the Scheduling Policies of an Embedded CUDA GPU

Embedded systems augmented with graphics processing units (GPUs) are seeing increased use in safety-critical real-time systems such as autonomous vehicles. Due to monetary cost along with size, weight, and power (SWaP) constraints, embedded GPUs are often computationally impoverished compared to those used in non-embedded systems. To maximize performance on such impoverished GPUs, we examine co-scheduling: allowing multiple applications concurrent access to a GPU. In this work, we use a new benchmarking framework to examine the internal scheduling policies of the black-box hardware and software used to co-schedule GPU tasks on the NVIDIA Jetson TX1.
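The co-scheduling scenario above can be sketched in CUDA: two independent streams each submit a kernel, and whether those kernels actually execute concurrently is decided by NVIDIA's undocumented scheduling policy, the black box this work probes. This is a minimal illustrative sketch, not code from the paper; the kernel name and workload are hypothetical.

```cuda
// Sketch: two independent streams submit work the GPU *may* co-schedule.
// The hardware/driver scheduler decides whether the kernels overlap.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void busyKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        for (int k = 0; k < 1000; ++k)       // artificial compute load
            data[i] = data[i] * 1.0001f + 0.5f;
}

int main(void) {
    const int n = 1 << 20;
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // No dependency links the two launches, so the scheduler is free
    // (but not obliged) to run them concurrently.
    busyKernel<<<n / 256, 256, 0, s1>>>(a, n);
    busyKernel<<<n / 256, 256, 0, s2>>>(b, n);

    cudaStreamSynchronize(s1);
    cudaStreamSynchronize(s2);

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    printf("both streams drained\n");
    return 0;
}
```

Timing such launches from separate processes (rather than streams in one process) is what exposes the TX1's cross-application arbitration, since streams and processes are handled differently by the driver.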
