dJay: enabling high-density multi-tenancy for cloud gaming servers with dynamic cost-benefit GPU load balancing

In cloud gaming, servers perform remote rendering on behalf of thin clients. Such a server must deliver a sufficient frame rate (at least 30 fps) to each of its clients. At the same time, each client desires an immersive experience, so the server should also provide the best possible graphics quality to each client. Statically provisioning GPU time slices for each client leads to severe underutilization, because clients come and go and the scenes they need rendered vary greatly in GPU resource usage over time. In this work, we present dJay, a utility-maximizing cloud gaming server that dynamically tunes client GPU rendering workloads in order to 1) ensure that all clients get a satisfactory frame rate, and 2) provide the best possible graphics quality across clients. To accomplish this, we develop two main components. First, we build an online profiler that collects key cost and benefit data and distills the data into a reusable regression model. Second, we build an online utility optimizer that uses the regression model to tune GPU workloads for better graphics quality. The optimizer solves the Multiple-Choice Knapsack Problem. We demonstrate dJay on two high-quality commercial games, Doom 3 and Fable 3. Our results show that, compared to a static configuration, dJay responds much better to peaks and troughs, achieving up to four times the multi-tenant density on a single server while offering clients the best possible graphics quality.
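
To make the optimizer's problem concrete, the following is a minimal sketch of a Multiple-Choice Knapsack solver, not dJay's actual implementation. It assumes that each client has a small set of candidate rendering settings, each with an integer GPU cost (hypothetically, milliseconds of GPU time per frame) and a quality benefit score, and that the server must pick exactly one setting per client without exceeding a shared per-frame GPU budget. The function name `solve_mckp`, the units, and all numbers are illustrative assumptions; the abstract names only the problem class, and the dynamic program below is one standard way to solve it.

```python
from typing import List, Optional, Tuple

# (gpu_cost, quality_benefit) for one candidate setting; units are hypothetical.
Option = Tuple[int, float]


def solve_mckp(clients: List[List[Option]],
               budget: int) -> Optional[Tuple[float, List[int]]]:
    """Choose exactly one option per client with total cost <= budget,
    maximizing total benefit. Returns (benefit, chosen indices) or None
    if no assignment fits within the budget."""
    neg = float("-inf")
    # best[b] = max benefit over the clients seen so far at total cost exactly b.
    best = [neg] * (budget + 1)
    best[0] = 0.0
    picks: List[List[int]] = []  # picks[i][b] = option chosen for client i at cost b

    for options in clients:
        new = [neg] * (budget + 1)
        pick = [-1] * (budget + 1)
        for b in range(budget + 1):
            for j, (cost, benefit) in enumerate(options):
                if cost <= b and best[b - cost] != neg:
                    cand = best[b - cost] + benefit
                    if cand > new[b]:
                        new[b] = cand
                        pick[b] = j
        best = new
        picks.append(pick)

    b = max(range(budget + 1), key=lambda x: best[x])
    if best[b] == neg:
        return None
    total = best[b]

    # Walk backwards through the pick tables to recover each client's setting.
    chosen: List[int] = []
    for i in range(len(clients) - 1, -1, -1):
        j = picks[i][b]
        chosen.append(j)
        b -= clients[i][j][0]
    chosen.reverse()
    return total, chosen


# Three hypothetical clients, each with low / medium / high settings.
example_clients = [
    [(5, 10), (10, 20), (20, 30)],
    [(4, 10), (12, 25), (18, 30)],
    [(6, 10), (9, 15), (15, 28)],
]
# A budget of 33 stands in for roughly one frame interval at 30 fps.
print(solve_mckp(example_clients, 33))  # (63.0, [0, 1, 2])
```

In this toy instance the best assignment gives the first client its cheapest setting so that the other two can afford richer ones, which is the kind of cross-client trade-off a utility-maximizing server has to make. In an online system the cost and benefit figures would come from a profiler-driven model rather than from fixed constants as they do here.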
