Time-constrained and network-aware containers scheduling in GPU era

Abstract Recent advances in data center management and application development are reflected in lightweight container technology and critical Quality-of-Service (QoS) requirements. Tenants encapsulate applications in containers, abstracting away details of the infrastructure, and entrust the infrastructure's management framework with provisioning their network and time QoS requirements. In this paper, we address this NP-hard scheduling problem by proposing a GPU Accelerated Containers Scheduler (GPUACS). We model the joint allocation of network and containers with QoS requirements as a graph embedding problem. GPUACS innovates by refactoring two Multicriteria Decision-Making (MCDM) methods for the GPU model, as well as by defining an efficient data structure to speed up the comparison of time-evolving QoS requirements. GPUACS follows a modular and configurable architecture, and the scheduling objective function can be adjusted by selecting the MCDM method and setting the appropriate weights to guide the comparisons. An experimental analysis demonstrated how sensitive the GPU-tailored MCDM methods are when scheduling container requests under critical time, network, and processing criteria, as well as under multiple queueing policies.
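To illustrate how the weights steer an MCDM-based scheduling decision, the following is a minimal CPU-side sketch in Python. It is not the GPUACS implementation. The abstract does not name the two MCDM methods, so TOPSIS is used here as an assumed example, and the function name `topsis_rank`, the host names, and the three criteria (free CPU cores, free bandwidth, latency) are all hypothetical.

```python
import math

def topsis_rank(hosts, weights, benefit):
    """Rank candidate hosts with TOPSIS (assumed MCDM method, illustrative only).

    hosts   : dict mapping host name -> list of criterion values
    weights : list of non-negative criterion weights (normalized internally)
    benefit : list of bools, True when a larger value is better
    """
    names = list(hosts)
    n_crit = len(weights)
    total = sum(weights)
    w = [x / total for x in weights]

    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(hosts[h][j] ** 2 for h in names)) or 1.0
             for j in range(n_crit)]
    v = {h: [w[j] * hosts[h][j] / norms[j] for j in range(n_crit)]
         for h in names}

    # Ideal and anti-ideal points depend on whether a criterion is
    # a benefit (maximize) or a cost (minimize).
    ideal, anti = [], []
    for j in range(n_crit):
        column = [v[h][j] for h in names]
        ideal.append(max(column) if benefit[j] else min(column))
        anti.append(min(column) if benefit[j] else max(column))

    # Closeness coefficient: 1.0 means the host coincides with the ideal point.
    scores = {}
    for h in names:
        d_pos = math.sqrt(sum((v[h][j] - ideal[j]) ** 2 for j in range(n_crit)))
        d_neg = math.sqrt(sum((v[h][j] - anti[j]) ** 2 for j in range(n_crit)))
        scores[h] = d_neg / (d_pos + d_neg) if (d_pos + d_neg) > 0 else 0.0

    ranking = sorted(names, key=lambda h: scores[h], reverse=True)
    return ranking, scores

# Hypothetical hosts: [free CPU cores, free bandwidth (Gbps), latency (ms)]
hosts = {
    "h1": [8, 10, 2.0],
    "h2": [4, 40, 0.5],
    "h3": [16, 1, 5.0],
}
benefit = [True, True, False]  # latency is a cost criterion

cpu_first, _ = topsis_rank(hosts, [1, 0, 0], benefit)
net_first, _ = topsis_rank(hosts, [0, 1, 0], benefit)
```

Changing only the weight vector changes which host ranks first, which is the sense in which the weights adjust the scheduling objective. In a GPU setting, the per-host normalization and distance computations are independent and could be evaluated in parallel, but that mapping is not shown here.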
