Optimizing virtual backup allocation for middleboxes

In enterprise networks, network functions such as address translation, firewalls, and deep packet inspection are often implemented in middleboxes. These middleboxes can become temporarily unavailable because of misconfiguration or software and hardware malfunctions. Traditionally, middlebox survivability is achieved through an expensive active-standby deployment in which each middlebox has a dedicated backup instance that is activated when a failure occurs. Network Function Virtualization (NFV) is a networking paradigm that allows flexible, scalable, and inexpensive implementation of network services. In this work we suggest a novel approach for planning and deploying backup schemes for network functions that guarantee high levels of survivability with a significant reduction in resource consumption. The suggested backup scheme takes advantage of the flexibility and resource-sharing abilities of the NFV paradigm to maintain only a few backup servers, each of which can serve one of multiple functions when the corresponding middleboxes are unavailable. We describe different goals that network designers can take into account when determining which functions to implement in each of the backup servers. We rely on a graph-theoretic model to find properties of efficient assignments and to develop algorithms that find them. Extensive experiments show, for example, that under realistic function failure probabilities and reasonable capacity limitations, one can obtain a 99.9% survival probability with half the number of servers required by standard techniques.
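To make the shared-backup idea concrete, the following minimal sketch estimates the survival probability of a given backup assignment. It is an illustration under stated assumptions, not the algorithm of the paper: functions are assumed to fail independently, each backup server is assumed to take over at most one failed function at a time, and the network survives if every failed function can be matched to a distinct backup server that implements it. The names `can_recover` and `survival_probability`, the function labels, and the failure probabilities are hypothetical, and the brute-force enumeration is only practical for a handful of functions.

```python
from itertools import product


def can_recover(failed, backups):
    """Return True if every failed function can be assigned to a distinct
    backup server that implements it (bipartite matching, augmenting paths).

    backups[j] is the set of functions that backup server j can run.
    Assumption: a server serves at most one function at a time.
    """
    match = {}  # server index -> function it currently serves

    def try_assign(func, seen):
        for j, supported in enumerate(backups):
            if func in supported and j not in seen:
                seen.add(j)
                # Use server j if it is free, or if its current function
                # can be moved to another server.
                if j not in match or try_assign(match[j], seen):
                    match[j] = func
                    return True
        return False

    return all(try_assign(func, set()) for func in failed)


def survival_probability(fail_prob, backups):
    """Sum the probability of every failure pattern that can be recovered.

    fail_prob maps each function to its (assumed independent) failure
    probability. Enumerates all 2^n patterns, so n must be small.
    """
    funcs = list(fail_prob)
    total = 0.0
    for pattern in product([False, True], repeat=len(funcs)):
        prob = 1.0
        for func, down in zip(funcs, pattern):
            prob *= fail_prob[func] if down else 1.0 - fail_prob[func]
        failed = [func for func, down in zip(funcs, pattern) if down]
        if can_recover(failed, backups):
            total += prob
    return total


# Hypothetical example: four functions, each failing with probability 0.1.
fail_prob = {"nat": 0.1, "firewall": 0.1, "dpi": 0.1, "lb": 0.1}
split = [{"nat", "firewall"}, {"dpi", "lb"}]   # each server covers two functions
shared = [set(fail_prob), set(fail_prob)]      # each server covers all four
print(survival_probability(fail_prob, split))
print(survival_probability(fail_prob, shared))
```

In this toy setting, two backup servers that each implement all four functions survive any pattern of at most two failures (probability about 0.9963), while splitting the functions between the servers survives only when each pair has at most one failure (about 0.9801). The comparison illustrates why the choice of which functions to implement on each backup server matters.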
