VGRIS: Virtualized GPU Resource Isolation and Scheduling in Cloud Gaming

Fueled by the maturity of virtualization technology for Graphics Processing Units (GPUs), a growing number of data centers are dedicated to GPU-related computation tasks in cloud gaming. However, GPU resource sharing in these applications is usually poor, because typical cloud gaming service providers allocate one GPU exclusively to each game. To manage computational resources efficiently, cloud platforms need multi-task scheduling technologies that improve GPU utilization. In this paper, we propose VGRIS, a resource management framework for Virtualized GPU Resource Isolation and Scheduling in cloud gaming. By leveraging the mature GPU paravirtualization architecture, VGRIS resides in the host and works through library API interception, so the guest OS and the GPU computing applications remain unmodified. Within this framework, we implement three scheduling algorithms for different objectives: Service Level Agreement (SLA)-aware scheduling, proportional-share scheduling, and hybrid scheduling that mixes the former two. Such a scheduling framework makes it possible to handle different kinds of GPU computation tasks for different purposes in cloud gaming. Our experimental results show that each scheduling algorithm achieves its goals under various workloads.
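To make two of the scheduling objectives named above concrete, the following is a minimal sketch and not the VGRIS implementation. It assumes that an SLA can be expressed as a target frame rate and that proportional-share scheduling divides a fixed GPU time budget by per-VM weights. The function names `proportional_shares` and `sla_delay_ms`, their parameters, and the default of 30 frames per second are all hypothetical choices made for illustration.

```python
def proportional_shares(weights, capacity=1.0):
    """Split `capacity` (e.g. GPU time per scheduling period) by weight.

    `weights` maps a VM name to a positive weight (hypothetical inputs).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {vm: capacity * w / total for vm, w in weights.items()}


def sla_delay_ms(frame_time_ms, target_fps=30.0):
    """Delay to add so a frame is not delivered faster than the target rate.

    Assumption: the SLA is a target frame rate; a frame that already takes
    longer than its budget receives no extra delay.
    """
    budget_ms = 1000.0 / target_fps
    return max(0.0, budget_ms - frame_time_ms)


if __name__ == "__main__":
    print(proportional_shares({"vm_a": 1, "vm_b": 3}, capacity=100.0))
    print(sla_delay_ms(10.0, target_fps=50.0))
```

Under these assumptions, a hybrid policy could switch between the two functions depending on whether every VM currently meets its target rate; the abstract does not specify how VGRIS mixes them.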
