SDNProbe: Lightweight Fault Localization in the Error-Prone Environment

Probe-based fault localization identifies potentially faulty nodes, which are then manually inspected for confirmation. This work explores efficient and accurate fault localization, which is crucial for reducing manual effort without affecting network functionality. Prior work suffers from either high bandwidth overhead or false detection (i.e., incorrectly blaming good nodes or missing faulty ones), especially in the presence of multiple or inconsistent faults. We propose SDNProbe, a lightweight SDN application that sends a provably minimized number of probe packets to pinpoint malfunctioning switches. We extend SDNProbe to randomize the tested paths and packet headers, which further improves detection accuracy. Using realistic topologies and flow rules, our evaluation confirms that SDNProbe can rapidly localize faulty switches while reducing the number of required test packets by 30% compared to prior approaches. Even with 50% of switches being faulty, the extended SDNProbe can detect all faulty switches in 33 seconds, whereas prior approaches have false negative rates of 15-40%.
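The abstract does not spell out how the probe count is minimized. As a rough illustration only, one common way to cover every element of a forwarding graph with as few test paths as possible is a minimum path cover on a directed acyclic graph, computed through bipartite matching. The sketch below assumes that model; the function name `min_path_cover`, the integer node labels, and the acyclic-graph assumption are all hypothetical and are not taken from SDNProbe itself.

```python
from typing import List, Tuple


def min_path_cover(n: int, edges: List[Tuple[int, int]]) -> List[List[int]]:
    """Return a minimum set of vertex-disjoint paths covering nodes 0..n-1.

    Assumes the graph is a DAG (an illustrative assumption). Uses the
    classic reduction: minimum path cover size = n - maximum matching in
    the bipartite graph that links each node's "out" copy to the "in"
    copy of its successors.
    """
    adj = {u: [] for u in range(n)}
    for u, v in edges:
        adj[u].append(v)

    # match_in[v] = u means edge u -> v was chosen as part of some path.
    match_in = [-1] * n

    def try_augment(u: int, seen: set) -> bool:
        # Standard augmenting-path search (Kuhn's algorithm).
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_in[v] == -1 or try_augment(match_in[v], seen):
                match_in[v] = u
                return True
        return False

    for u in range(n):
        try_augment(u, set())

    # Chosen edges give each node at most one successor and one predecessor.
    succ = {u: v for v, u in enumerate(match_in) if u != -1}
    starts = [v for v in range(n) if match_in[v] == -1]

    paths = []
    for s in starts:
        path = [s]
        while path[-1] in succ:
            path.append(succ[path[-1]])
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Hypothetical five-node graph: 0 -> 1 -> 2 and 0 -> 3 -> 4.
    for p in min_path_cover(5, [(0, 1), (1, 2), (0, 3), (3, 4)]):
        print(p)
```

In this toy graph two paths suffice to visit all five nodes, so under the stated model two probes would exercise every node once. A real system would also have to check that a single packet header can actually traverse each chosen path, which this sketch ignores.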
