Proactive Live Migration for Virtual Network Functions using Machine Learning

VM (Virtual Machine) live migration is a server virtualization technique for moving a running VM to another server node while minimizing the downtime of the service the VM provides. In cloud data centers, VM live migration is currently widely used to balance CPU workload and network traffic, to reduce electricity consumption, and to provide uninterrupted service during hardware maintenance and software updates on servers. It is also critical to use VM live migration as a prevention or mitigation measure when indications of a possible failure are detected or predicted. Especially in an NFV (Network Function Virtualization) environment, timely live migration of a VNF (Virtual Network Function) can maintain system availability and reduce the operator's losses due to service failure. In this paper, we propose a proactive live migration method for vEPC (Virtual Evolved Packet Core) based on failure prediction. A machine learning model is trained on periodically collected monitoring data, namely resource usage and logs from servers and VMs/VNFs, to predict the probability of a future vEPC paging failure. We implemented the proposed method in an OpenStack-based NFV environment to evaluate the real service performance gains for open source vEPC implementations.
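
The prediction-then-migrate loop described above can be illustrated with a minimal sketch. The abstract does not specify the model, the features, or the decision rule, so everything here is an assumption for illustration: a logistic function with hand-picked weights stands in for the trained model, the `Sample` fields, `WEIGHTS`, `BIAS`, `THRESHOLD`, and `WINDOW` are hypothetical, and the actual migration call to the orchestrator is not shown.

```python
import math
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    """One periodic monitoring sample (hypothetical feature set)."""
    cpu: float    # CPU utilization in [0, 1]
    mem: float    # memory utilization in [0, 1]
    errors: int   # error-level log lines seen in the interval


# Hypothetical parameters standing in for a trained model.
WEIGHTS = {"cpu": 4.0, "mem": 3.0, "errors": 0.5}
BIAS = -6.0
THRESHOLD = 0.8   # migrate when predicted failure probability reaches this
WINDOW = 3        # number of most recent samples fed to the model


def failure_probability(window):
    """Logistic score over the window's mean features."""
    n = len(window)
    avg_cpu = sum(s.cpu for s in window) / n
    avg_mem = sum(s.mem for s in window) / n
    avg_err = sum(s.errors for s in window) / n
    z = (BIAS
         + WEIGHTS["cpu"] * avg_cpu
         + WEIGHTS["mem"] * avg_mem
         + WEIGHTS["errors"] * avg_err)
    return 1.0 / (1.0 + math.exp(-z))


def migration_decisions(samples):
    """Return, per sample, whether a proactive migration would be triggered."""
    window = deque(maxlen=WINDOW)
    decisions = []
    for sample in samples:
        window.append(sample)
        if len(window) < WINDOW:
            # Not enough history yet to make a prediction.
            decisions.append(False)
            continue
        decisions.append(failure_probability(window) >= THRESHOLD)
    return decisions


if __name__ == "__main__":
    healthy = Sample(cpu=0.2, mem=0.3, errors=0)
    degraded = Sample(cpu=0.95, mem=0.9, errors=6)
    print(migration_decisions([healthy] * 3 + [degraded] * 3))
```

In this sketch the decision only flips once the whole window reflects the degraded state, which shows how the window length trades detection delay against sensitivity to a single noisy sample.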
