A server-based approach for predictable GPU access control

We propose a server-based approach to manage a general-purpose graphics processing unit (GPU) in a predictable and efficient manner. Our proposed approach introduces a GPU server task that is dedicated to handling GPU requests from other tasks on their behalf. The GPU server ensures bounded time to access the GPU, and allows other tasks to suspend during their GPU computation to save CPU cycles. By doing so, we address the two major limitations of the existing real-time synchronization-based GPU management approach: busy waiting and long priority inversion. We implemented a prototype of the server-based approach on a real embedded platform. This case study demonstrates the practicality and effectiveness of the server-based approach. Experimental results indicate that the server-based approach yields significant improvements in task schedulability over the existing synchronization-based approach in most practical settings. Although we focus on a GPU in this paper, the server-based approach can also be used for other types of computational accelerators.

[1]  Young-Woo Seo,et al.  Kernel-based traffic sign tracking to improve highway workzone recognition for reliable autonomous driving , 2013, 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013).

[2]  Ragunathan Rajkumar,et al.  Coordinated Task Scheduling, Allocation and Synchronization on Multiprocessors , 2009, 2009 30th IEEE Real-Time Systems Symposium.

[3]  Shinpei Kato,et al.  Gdev: First-Class GPU Resource Management in the Operating System , 2012, USENIX Annual Technical Conference.

[4]  Shinpei Kato,et al.  RGEM: A Responsive GPGPU Execution Model for Runtime Engines , 2011, 2011 IEEE 32nd Real-Time Systems Symposium.

[5]  Ragunathan Rajkumar,et al.  Towards a viable autonomous driving research platform , 2013, 2013 IEEE Intelligent Vehicles Symposium (IV).

[6]  Hennadiy Leontyev,et al.  A Flexible Real-Time Locking Protocol for Multiprocessors , 2007, 13th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2007).

[7]  Ragunathan Rajkumar,et al.  Parallel scheduling for cyber-physical systems: Analysis and case study on a self-driving car , 2013, 2013 ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS).

[8]  Chung Laung Liu,et al.  Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment , 1989, JACM.

[9]  James H. Anderson,et al.  An optimal k-exclusion real-time locking protocol motivated by multi-GPU systems , 2012, Real-Time Systems.

[10]  Jian-Jia Chen,et al.  Many suspensions, many problems: a review of self-suspending tasks in real-time systems , 2018, Real-Time Systems.

[11]  Ming Yang,et al.  An Evaluation of the NVIDIA TX1 for Supporting Real-Time Computer-Vision Workloads , 2017, 2017 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS).

[12]  Björn B. Brandenburg The FMLP+: An Asymptotically Optimal Real-Time Locking Protocol for Suspension-Aware Analysis , 2014, 2014 26th Euromicro Conference on Real-Time Systems.

[13]  James H. Anderson,et al.  Globally scheduled real-time multiprocessor systems with GPUs , 2011, Real-Time Systems.

[14]  Lui Sha,et al.  Real-time synchronization protocols for multiprocessors , 1988, Proceedings. Real-Time Systems Symposium.

[15]  Björn Andersson,et al.  Segment-Fixed Priority Scheduling for Self-Suspending Real-Time Tasks , 2013, 2013 IEEE 34th Real-Time Systems Symposium.

[16]  Jian-Jia Chen,et al.  Errata for Three Papers (2004-05) on Fixed-Priority Scheduling with Self-Suspensions , 2018, Leibniz Trans. Embed. Syst..

[17]  Shinpei Kato,et al.  TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments , 2011, USENIX Annual Technical Conference.

[18]  John P. Lehoczky,et al.  Partitioned Fixed-Priority Preemptive Scheduling for Multi-core Processors , 2009, 2009 21st Euromicro Conference on Real-Time Systems.

[19]  James H. Anderson,et al.  GPUSync: A Framework for Real-Time GPU Management , 2013, 2013 IEEE 34th Real-Time Systems Symposium.

[20]  James H. Anderson,et al.  The OMLP family of optimal multiprocessor real-time locking protocols , 2013, Des. Autom. Embed. Syst..

[21]  Ragunathan Rajkumar,et al.  Linux/RK: A Portable Resource Kernel in Linux , 2005 .

[22]  Cong Liu,et al.  GPES: a preemptive execution system for GPGPU computing , 2015, 21st IEEE Real-Time and Embedded Technology and Applications Symposium.

[23]  Konstantinos Bletsas,et al.  Realistic analysis of limited parallel software/hardware implementations , 2004, Proceedings. RTAS 2004. 10th IEEE Real-Time and Embedded Technology and Applications Symposium, 2004..

[24]  Mateo Valero,et al.  Enabling preemptive multiprogramming on GPUs , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).

[25]  R. Rajkumar Real-time synchronization protocols for shared memory multiprocessors , 1990, Proceedings.,10th International Conference on Distributed Computing Systems.