Dynamic scheduling of network updates

We present Dionysus, a system for fast, consistent network updates in software-defined networks. Dionysus encodes the consistency-related dependencies among updates at individual switches as a graph, and then dynamically schedules these updates based on runtime differences in the update speeds of different switches. This dynamic scheduling is the key to its speed: prior update methods are slow because they pre-determine a schedule, which does not adapt to runtime conditions. Testbed experiments and data-driven simulations show that Dionysus improves the median update speed by 53–88% in both wide-area and data center networks compared to prior methods.
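The contrast between a pre-determined schedule and dynamic scheduling over a dependency graph can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the Dionysus algorithm itself: the names `dynamic_schedule`, `static_schedule`, `deps`, and `duration` are hypothetical, each update is assumed to take a known time once issued, and the static baseline is modeled as fixed rounds that each wait for their slowest update, which is only one simple stand-in for a pre-determined schedule.

```python
import heapq


def dynamic_schedule(deps, duration):
    """Issue each update as soon as all of its prerequisites have finished.

    deps: dict mapping every update name -> set of updates that must finish first
    duration: dict mapping update name -> time the switch takes (assumed known here)
    Returns a dict mapping update name -> finish time.
    """
    remaining = {u: set(prereqs) for u, prereqs in deps.items()}
    dependents = {u: [] for u in deps}
    for u, prereqs in deps.items():
        for p in prereqs:
            dependents[p].append(u)

    events = []  # min-heap of (finish_time, update) for updates in flight
    for u, prereqs in remaining.items():
        if not prereqs:
            heapq.heappush(events, (duration[u], u))

    finish = {}
    while events:
        now, u = heapq.heappop(events)
        finish[u] = now
        # Completing u may unblock updates that were waiting on it.
        for v in dependents[u]:
            remaining[v].discard(u)
            if not remaining[v]:
                heapq.heappush(events, (now + duration[v], v))

    if len(finish) != len(deps):
        raise ValueError("dependency cycle: some updates can never be issued")
    return finish


def static_schedule(rounds, duration):
    """Run fixed rounds; a round starts only when the previous one is fully done."""
    finish, now = {}, 0
    for batch in rounds:
        for u in batch:
            finish[u] = now + duration[u]
        now = max(finish[u] for u in batch)  # wait for the slowest update
    return finish


# Hypothetical example: "c" depends on "a", "d" depends on "b",
# and the switch handling "b" happens to be slow at runtime.
deps = {"a": set(), "b": set(), "c": {"a"}, "d": {"b"}}
duration = {"a": 1, "b": 5, "c": 5, "d": 1}

dynamic_finish = dynamic_schedule(deps, duration)
static_finish = static_schedule([["a", "b"], ["c", "d"]], duration)
print(max(dynamic_finish.values()), max(static_finish.values()))
```

In this toy instance the dynamic scheduler issues "c" as soon as "a" completes instead of waiting for the slow update "b", so the whole update finishes earlier than under the round-based plan.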
