Transparent Web Service Auditing via Network Provenance Functions

Detecting and explaining the nature of attacks in distributed web services is often difficult -- determining the nature of suspicious activity requires following the trail of an attacker through a chain of heterogeneous software components including load balancers, proxies, worker nodes, and storage services. Unfortunately, existing forensic solutions cannot provide the necessary context to link events across complex workflows, particularly in instances where application layer semantics (e.g., SQL queries, RPCs) are needed to understand the attack. In this work, we present a transparent provenance-based approach for auditing web services through the introduction of Network Provenance Functions (NPFs). NPFs are a distributed architecture for capturing detailed data provenance for web service components, leveraging the key insight that mediation of an application's protocols can be used to infer its activities without requiring invasive instrumentation or developer cooperation. We design and implement NPF with consideration for the complexity of modern cloud-based web services, and evaluate our architecture against a variety of applications including DVDStore, RUBiS, and WikiBench to show that our system imposes as little as 9.3% average end-to-end overhead on connections for realistic workloads. Finally, we consider several scenarios in which our system can be used to concisely explain attacks. NPF thus enables the hassle-free deployment of semantically rich provenance-based auditing for complex applications workflows in the Cloud.

[1]  Margo I. Seltzer,et al.  Layering in Provenance Systems , 2009, USENIX Annual Technical Conference.

[2]  Marianne Winslett,et al.  The Case of the Fake Picasso: Preventing History Forgery with Secure Provenance , 2009, FAST.

[3]  Xiangyu Zhang,et al.  High Accuracy Attack Provenance via Binary-based Execution Partition , 2013, NDSS.

[4]  References , 1971 .

[5]  Alessandro Orso,et al.  AMNESIA: analysis and monitoring for NEutralizing SQL-injection attacks , 2005, ASE.

[6]  Margo I. Seltzer,et al.  A General-Purpose Provenance Library , 2012, TaPP.

[7]  Andreas Haeberlen,et al.  Secure network provenance , 2011, SOSP.

[8]  Patrick D. McDaniel,et al.  Hi-Fi: collecting high-fidelity whole-system provenance , 2012, ACSAC '12.

[9]  Fareed Zaffar,et al.  Fine-grained tracking of Grid infections , 2010, 2010 11th IEEE/ACM International Conference on Grid Computing.

[10]  Fareed Zaffar,et al.  Identifying the provenance of correlated anomalies , 2011, SAC '11.

[11]  Jaehong Park,et al.  A provenance-based access control model , 2012, 2012 Tenth Annual International Conference on Privacy, Security and Trust.

[12]  Yong Guan,et al.  Detection of stepping stone attack under delay and chaff perturbations , 2006, 2006 IEEE International Performance Computing and Communications Conference.

[13]  Imad M. Abbadi,et al.  Challenges for Provenance in Cloud Computing , 2011, TaPP.

[14]  Xiangyu Zhang,et al.  ProTracer: Towards Practical Provenance Tracing by Alternating Between Logging and Tainting , 2016, NDSS.

[15]  Willy Zwaenepoel,et al.  Performance and scalability of EJB applications , 2002, OOPSLA '02.

[16]  Thomas Moyer,et al.  Trustworthy Whole-System Provenance for the Linux Kernel , 2015, USENIX Security Symposium.

[17]  Paul T. Groth,et al.  Trade-Offs in Automatic Provenance Capture , 2016, IPAW.

[18]  Kevin R. B. Butler,et al.  Towards secure provenance-based access control in cloud environments , 2013, CODASPY.

[19]  Andreas Haeberlen,et al.  Let SDN Be Your Eyes: Secure Forensics in Data Center Networks , 2014 .

[20]  Ashish Gehani,et al.  SPADE: Support for Provenance Auditing in Distributed Environments , 2012, Middleware.

[21]  Ethan L. Miller,et al.  Tracking Emigrant Data via Transient Provenance , 2011, TaPP.

[22]  Jennifer Widom,et al.  Trio: A System for Integrated Management of Data, Accuracy, and Lineage , 2004, CIDR.

[23]  Wang Chiew Tan,et al.  DBNotes: a post-it system for relational databases based on provenance , 2005, SIGMOD '05.

[24]  Ulf Lindqvist,et al.  Bonsai: Balanced Lineage Authentication , 2007, Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007).

[25]  Keqiang He,et al.  Next stop, the cloud: understanding modern web service deployment in EC2 and azure , 2013, Internet Measurement Conference.

[26]  Luc Moreau,et al.  PROV-Overview. An Overview of the PROV Family of Documents , 2013 .

[27]  Val Tannen,et al.  Collaborative data sharing via update exchange and provenance , 2013, TODS.