Falcon: Differential fault localization for SDN control plane

Abstract The control plane of Software-Defined Networking (SDN) is the key component for overseeing and managing networks. As a software entity, the control plane inevitably contains design or logic flaws in its policy enforcement and network control, which can cause it to behave incorrectly and induce network anomalies. Existing approaches mainly focus on policy verification or fault troubleshooting, and offer little capability for locating these flaws in production environments. In this paper, we present Falcon, the first fault localization tool for the SDN control plane. We design a novel causal inference mechanism based on differential checking, which symmetrically compares two system behaviors with similar processes and identifies the causality in related code execution paths, with concrete contexts, to explain why a fault happened in the SDN network. Our main contributions include (1) a lightweight rule-based hybrid tracing mechanism for recording system behaviors of the SDN control plane, (2) a context-aware modeling mechanism for modeling these behaviors, and (3) a differential checking mechanism for diagnosing controller faults according to formulated symptoms. Our evaluation shows that Falcon can diagnose faults in the SDN control plane with low performance overhead.
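The idea of differential checking can be illustrated with a minimal sketch. This is not Falcon's implementation: the trace format (`Event` with a code location and a context dictionary), the function names, and the example controller steps (`packet_in`, `lookup_host`, `install_flow`, `flood`) are all assumptions made for illustration. The sketch walks a correct trace and a faulty trace of similar processes in lockstep, records context differences at shared code locations, and stops at the first point where the execution paths diverge.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Event:
    """One traced step: a code location plus the context observed there.

    Hypothetical trace format, not the one used by Falcon.
    """
    location: str
    context: Dict[str, object]


def context_diff(a: Dict[str, object], b: Dict[str, object]) -> Dict[str, tuple]:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


def differential_check(reference: List[Event], faulty: List[Event]) -> dict:
    """Compare two traces step by step.

    Context differences seen at shared locations are collected as suspects;
    the walk stops at the first step where the code paths differ.
    """
    suspects = []
    for i, (ref, bad) in enumerate(zip(reference, faulty)):
        if ref.location != bad.location:
            return {"diverged_at": i, "expected": ref.location,
                    "observed": bad.location, "suspects": suspects}
        diff = context_diff(ref.context, bad.context)
        if diff:
            suspects.append((ref.location, diff))
    if len(reference) != len(faulty):
        # One trace is a strict prefix of the other.
        i = min(len(reference), len(faulty))
        expected = reference[i].location if i < len(reference) else None
        observed = faulty[i].location if i < len(faulty) else None
        return {"diverged_at": i, "expected": expected,
                "observed": observed, "suspects": suspects}
    return {"diverged_at": None, "expected": None,
            "observed": None, "suspects": suspects}


# Illustrative traces: the faulty run floods instead of installing a flow.
good = [Event("packet_in", {"in_port": 1}),
        Event("lookup_host", {"dst_known": True}),
        Event("install_flow", {"out_port": 2})]
bad = [Event("packet_in", {"in_port": 1}),
       Event("lookup_host", {"dst_known": False}),
       Event("flood", {})]

report = differential_check(good, bad)
print(report)
```

In this toy case the report points at the step where the two paths split and lists the differing `dst_known` context at `lookup_host` as the likely cause, which mirrors the abstract's goal of explaining a fault through code paths and their concrete contexts.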
