Detection of Service Provider Hardware Over-commitment in Container Orchestration Environments

The deployment of container-based services continues to grow, mainly because of their fast provisioning and lower allocation overhead. Yet the literature still neglects the performance degradation that containers suffer from multi-tenancy and service provider hardware over-commitment. This paper proposes a new hardware over-commitment detection model for container orchestration environments, implemented in two stages. First, the hardware usage of deployed containers is continuously monitored in a non-intrusive manner, leveraging the container engine's resource management interface. Second, the collected features are fed to a recurrent neural network model that detects both container-level and service-level hardware over-commitment, following a time-series rationale. Experiments on a containerized Apache Spark distribution show that multi-tenancy and hardware over-commitment significantly affect its performance. In addition, the proposed model detects hardware over-commitment with a true-positive rate of up to 91% at the container level and up to 93% at the service level.
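
The time-series rationale mentioned above can be illustrated with a minimal sketch of how periodically collected per-container usage samples might be grouped into fixed-length windows before being passed to a recurrent model. The abstract does not state which features or window length are used, so the `(cpu_fraction, memory_fraction)` sample layout, the function names, and the window size of 3 are assumptions made purely for illustration.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Tuple

# Hypothetical sample layout: (cpu_fraction, memory_fraction) for one reading.
# The actual feature set is not given in the abstract.
Sample = Tuple[float, float]


def sliding_windows(samples: Iterable[Sample], window: int) -> List[List[Sample]]:
    """Return every run of `window` consecutive samples, oldest sample first."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf: Deque[Sample] = deque(maxlen=window)  # drops the oldest sample when full
    out: List[List[Sample]] = []
    for sample in samples:
        buf.append(sample)
        if len(buf) == window:
            out.append(list(buf))
    return out


def windows_per_container(
    usage: Dict[str, List[Sample]], window: int
) -> Dict[str, List[List[Sample]]]:
    """Build the windowed sequences separately for each monitored container."""
    return {name: sliding_windows(series, window) for name, series in usage.items()}


if __name__ == "__main__":
    # Made-up readings for a hypothetical container name.
    usage = {"spark-worker-1": [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6), (0.7, 0.8)]}
    for name, windows in windows_per_container(usage, window=3).items():
        print(name, len(windows), "windows")
```

Each window would then be one input sequence for the detector. A service-level input could be formed by combining the windows of the containers that belong to the same service, although how the paper actually does this is not described in the abstract.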
