Debugging the data plane with Anteater

Diagnosing problems in networks is a time-consuming and error-prone process. Existing tools to assist operators primarily focus on analyzing control plane configuration. Configuration analysis has two limitations: it cannot find bugs in router software, and it is hard to generalize across protocols because it must model complex configuration languages and dynamic protocol behavior. This paper studies an alternate approach: diagnosing problems through static analysis of the data plane. This approach can catch bugs that are invisible at the level of configuration files, and it simplifies unified analysis of a network across many protocols and implementations. We present Anteater, a tool for checking invariants in the data plane. Anteater translates high-level network invariants into instances of Boolean satisfiability problems (SAT), checks them against network state using a SAT solver, and reports counterexamples when violations are found. Applied to a large university network, Anteater revealed 23 bugs, including forwarding loops and stale ACL rules, with only five false positives. Nine of these faults are being fixed by campus network operators.
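
To make the idea of checking an invariant against data plane state concrete, the following is a minimal sketch and not Anteater's actual encoding or implementation. It assumes a hypothetical four-router topology in which each forwarding edge carries a Boolean predicate over a 2-bit destination header, and it uses brute-force enumeration as a stand-in for a real SAT solver. The names `EDGES` and `find_counterexample` are illustrative only.

```python
from itertools import product

# Hypothetical toy data plane: each edge (u, v) maps to a predicate over a
# 2-bit destination header (b0, b1) that says which packets u forwards to v.
EDGES = {
    ("A", "B"): lambda b0, b1: b0,
    ("B", "C"): lambda b0, b1: b0 and b1,
    ("C", "A"): lambda b0, b1: b1,
    ("B", "D"): lambda b0, b1: b0 and not b1,
    ("D", "A"): lambda b0, b1: not b0,
}

def find_counterexample(path):
    """Return a header accepted by every hop on `path`, or None.

    The conjunction of the per-hop predicates plays the role of the SAT
    instance; enumerating all headers stands in for the solver.
    """
    hops = list(zip(path, path[1:]))
    for bits in product([False, True], repeat=2):
        # A missing edge forwards nothing, so its predicate is always False.
        if all(EDGES.get(hop, lambda *_: False)(*bits) for hop in hops):
            return bits
    return None

# Invariant "no forwarding loop": a satisfying header is a counterexample.
loop_abca = find_counterexample(["A", "B", "C", "A"])
loop_abda = find_counterexample(["A", "B", "D", "A"])
print("A->B->C->A:", loop_abca)
print("A->B->D->A:", loop_abda)
```

In this sketch a returned header is a witness packet that would circulate around the given cycle, while `None` means the cycle's predicates are jointly unsatisfiable, so the loop-freedom invariant holds for that cycle. A real checker would hand the formula to a SAT solver rather than enumerate headers, since realistic header spaces are far too large to enumerate.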
