A Survey on Fault Management in Software-Defined Networks

Software-defined networking (SDN) is an emerging paradigm that has become increasingly popular in recent years. The core idea is to separate the control and data planes, allowing the construction of network applications using high-level abstractions that are translated to network devices through a southbound interface. The SDN architecture is composed of three layers: 1) the infrastructure layer, responsible exclusively for data forwarding; 2) the control layer, which maintains the network view and provides core network abstractions; and 3) the application layer, which uses the abstractions provided by the control layer to implement network applications. SDN provides features, such as flexibility and programmability, that are key enablers for meeting current network requirements (e.g., multi-tenant cloud networks and elastic optical networks). However, along with its benefits, SDN also brings new issues. In this survey, we focus on issues related to fault management. Each layer, as well as each interface between layers, introduces its own fault management threat vectors. Besides addressing the fault management issues of its own architecture, SDN must also handle the same problems faced by legacy networks, although programmability and centralized management may provide the flexibility needed to deal with them. This paper presents an overview of fault management in SDN. Its major contributions are as follows: 1) identification of the main fault management issues in SDN and their classification according to the affected layers; 2) a survey of efforts that address those issues and their classification according to the affected planes, the issues concerned, general approaches, and features; and 3) a discussion of the trade-offs of different approaches and their suitability for different scenarios.
