Service Oriented Verification Integrated Fault Reasoning for SDNs

Fault localization is a core element of SDN network management. Many SDN fault reasoning and verification techniques help operators by focusing either on analyzing the control plane configuration or on checking the data plane network behavior. These solutions are limited in that they cannot correlate network symptoms between the control and data planes, and they are hard to generalize across protocols because they must model complex configuration languages and dynamic protocol behavior. This paper proposes a new approach to SDN fault reasoning called Service Oriented Verification Integrated Reasoning (SOVIR). In the SOVIR system, a network user can request one or more network services via a high-level Service Provisioning Language (SPL). SOVIR automatically parses each provisioned service and presents it as a logical Service View, which consists of a pair of logical end nodes, a service specification, and a list of required network functions (e.g., a load balancer). After a service is provisioned in an SDN network, SOVIR queries the controller for the network topology and the flow rules of all SDN switches. Based on the flow rules and the configuration of the end nodes and network function nodes, SOVIR maps the Service View to an Implementation View, in which every logical component of the Service View is mapped to an actual system component in the actual network topology. SOVIR uses an extended Symptom-Fault-Verification model to systematically incorporate various verification techniques into the fault reasoning process and localize faults in the SDN. SOVIR has been evaluated in a simulation environment for accuracy and efficiency. The evaluation shows that, in the simulated SDN networks, both the performance and the accuracy of fault reasoning can be greatly improved by applying properly selected verification tools to specific network entities.
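The mapping from a Service View to an Implementation View can be illustrated with a minimal sketch. This is not SOVIR's actual implementation: the class names, the `next_hop` table (a heavily simplified stand-in for the flow rules returned by the controller), and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceView:
    """Hypothetical logical view of one provisioned service."""
    src: str                # logical source end node
    dst: str                # logical destination end node
    spec: str               # service specification, e.g. "tcp/80"
    functions: tuple = ()   # required network functions, in order


@dataclass
class ImplementationView:
    """Hypothetical view binding a service to the nodes it actually traverses."""
    service: ServiceView
    path: list


def trace_path(service, next_hop, max_hops=32):
    """Follow forwarding entries from src to dst.

    next_hop maps (current node, destination) to the next node; this is an
    assumed simplification of real flow rules. Returns None when the path
    hits a black hole, a loop, or exceeds max_hops.
    """
    node, path = service.src, [service.src]
    while node != service.dst:
        node = next_hop.get((node, service.dst))
        if node is None or node in path or len(path) >= max_hops:
            return None
        path.append(node)
    return path


def map_to_implementation(service, next_hop):
    """Return an ImplementationView, or None if the service is not realized."""
    path = trace_path(service, next_hop)
    if path is None:
        return None
    # Each required network function must appear on the path, in order.
    remaining = iter(path)
    if not all(fn in remaining for fn in service.functions):
        return None
    return ImplementationView(service, path)


# Usage: a web service from h1 to h2 that must pass through load balancer lb1.
rules = {
    ("h1", "h2"): "s1",
    ("s1", "h2"): "lb1",
    ("lb1", "h2"): "s2",
    ("s2", "h2"): "h2",
}
web = ServiceView("h1", "h2", "tcp/80", ("lb1",))
view = map_to_implementation(web, rules)
```

A `None` result marks a service whose logical view cannot be matched to the observed forwarding state, which is the kind of mismatch a fault reasoning step would then investigate with a targeted verification tool.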
