Efficient querying and maintenance of network provenance at internet-scale

Network accountability, forensic analysis, and failure diagnosis are becoming increasingly important for network management and security. Such capabilities often rely on network provenance, the ability to issue queries over network metadata. For example, network provenance may be used to trace the path a message traverses on the network, as well as to determine how message data were derived and which parties were involved in their derivation. This paper presents the design and implementation of ExSPAN, a generic and extensible framework that achieves efficient network provenance in a distributed environment. We utilize the database notion of data provenance to "explain" the existence of any network state, providing a versatile mechanism for network provenance. To achieve such flexibility at Internet-scale, ExSPAN uses declarative networking, in which network protocols can be modeled as continuous queries over distributed streams and specified concisely in a declarative query language. We extend existing data models for provenance developed in the database literature to enable distribution at Internet-scale, and investigate numerous optimization techniques to maintain and query distributed network provenance efficiently. The ExSPAN prototype is developed using RapidNet, a declarative networking platform based on the emerging ns-3 toolkit. Experiments over a simulated network and an actual deployment in a testbed environment demonstrate that our system supports a wide range of distributed provenance computations efficiently, resulting in significant reductions in bandwidth costs compared to traditional approaches.
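To make the idea of "explaining" network state concrete, the following is a minimal, single-process sketch, not ExSPAN's implementation and not RapidNet code. It assumes a textbook two-rule reachability program (the rule names `r1` and `r2`, the node names, and the functions `derive_reachable` and `base_tuples` are all hypothetical), records which rule and inputs produced each derived tuple, and then answers a simple provenance query: which base link tuples contributed to a given derived tuple.

```python
from collections import defaultdict

# Base tuples: directed links between hypothetical nodes.
links = {("a", "b"), ("b", "c"), ("a", "c")}


def derive_reachable(links):
    """Compute reachable(S, D) to a fixpoint and record every derivation.

    Illustrative rules, written informally in a Datalog-like style:
      r1: reachable(S, D) :- link(S, D)
      r2: reachable(S, D) :- link(S, Z), reachable(Z, D)
    """
    # Maps each derived tuple to the set of (rule, input tuples) that produce it.
    provenance = defaultdict(set)
    for (s, d) in links:
        provenance[("reachable", s, d)].add(("r1", (("link", s, d),)))

    changed = True
    while changed:
        changed = False
        for (s, z) in links:
            for fact in list(provenance):
                _, z2, d = fact
                if z2 != z or s == d:
                    continue  # join condition fails, or trivial self-reachability
                entry = ("r2", (("link", s, z), fact))
                head = ("reachable", s, d)
                if entry not in provenance[head]:
                    provenance[head].add(entry)
                    changed = True
    return provenance


def base_tuples(fact, provenance, seen=None):
    """Return every base link tuple that contributes to some derivation of fact."""
    seen = set() if seen is None else seen
    if fact in seen:
        return set()  # already explored; guards against cyclic derivations
    seen.add(fact)
    if fact[0] == "link":
        return {fact}
    result = set()
    for _rule, inputs in provenance.get(fact, ()):
        for inp in inputs:
            result |= base_tuples(inp, provenance, seen)
    return result


prov = derive_reachable(links)
target = ("reachable", "a", "c")
alternatives = prov[target]               # one entry per distinct derivation
contributors = base_tuples(target, prov)  # base links behind any derivation
```

Here `reachable(a, c)` has two alternative derivations: directly from `link(a, c)`, and through `link(a, b)` joined with `reachable(b, c)`. In a distributed setting the derivation records would be spread across nodes, so a query like `base_tuples` would become a distributed traversal; this sketch keeps everything in one dictionary purely for illustration.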
