A NICE Way to Test OpenFlow Applications

The emergence of OpenFlow-capable switches enables exciting new network functionality, at the risk of programming errors that make communication less reliable. The centralized programming model, where a single controller program manages the network, seems to reduce the likelihood of bugs. However, the system is inherently distributed and asynchronous, with events happening at different switches and end hosts, and inevitable delays affecting communication with the controller. In this paper, we present efficient, systematic techniques for testing unmodified controller programs. Our NICE tool applies model checking to explore the state space of the entire system: the controller, the switches, and the hosts. Scalability is the main challenge, given the diversity of data packets, the large system state, and the many possible event orderings. To address this, we propose a novel way to augment model checking with symbolic execution of event handlers (to identify representative packets that exercise code paths on the controller). We also present a simplified OpenFlow switch model (to reduce the state space), and effective strategies for generating event interleavings likely to uncover bugs. Our prototype tests Python applications on the popular NOX platform. In testing three real applications (a MAC-learning switch, in-network server load balancing, and energy-efficient traffic engineering), we uncover eleven bugs.
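To make the kind of program under test concrete, the following is a minimal sketch of the MAC-learning logic that such a controller event handler implements. All names here (the class, the `packet_in` method, the `FLOOD` sentinel) are illustrative assumptions for exposition, not the actual NOX API; a real NOX application would register a callback for packet-in events and install flow-table entries on the switch.

```python
class MacLearningSwitch:
    """Illustrative MAC-learning logic: remember which port each
    source MAC was seen on, and forward to the learned port for a
    known destination, otherwise flood."""

    FLOOD = -1  # hypothetical sentinel meaning "send out all ports"

    def __init__(self):
        self.mac_to_port = {}  # source MAC address -> ingress port

    def packet_in(self, in_port, src_mac, dst_mac):
        # Learn (or refresh) the port where the source MAC appeared.
        self.mac_to_port[src_mac] = in_port
        # Forward to the learned port if the destination is known,
        # otherwise flood the packet.
        return self.mac_to_port.get(dst_mac, self.FLOOD)
```

Even this tiny handler illustrates why systematic testing helps: its behavior depends on the order in which packet-in events arrive, so a model checker that explores event interleavings (as NICE does) can reach states that a single test run would miss.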
