KubCG: A dynamic Kubernetes scheduler for heterogeneous clusters

Container platforms are increasingly used to deploy cloud-based services. Many of these services also require graphics processing units (GPUs) to accelerate applications that exploit their parallel architecture, such as deep learning or video processing. Consequently, container technologies such as Docker and Kubernetes are adding GPU support, and effort is being devoted to designing algorithms that schedule applications onto heterogeneous computing systems that use CPUs and GPUs together. This article contributes to that effort: we describe how to build KubCG, a dynamic scheduling platform for Kubernetes that manages the deployment of Docker containers in a heterogeneous cluster. The platform implements a new scheduler that optimizes the deployment of new containers by taking into account the Kubernetes Pod timeline and historical information about the execution of the containers. We performed several tests to validate the new algorithm; in our experiments, KubCG reduced the time to complete different tasks to as little as 64% of the original time.
