Payload Attribution via Character Dependent Multi-Bloom Filters

Network forensic analysts use payload attribution systems (PAS) as an investigative tool to store and summarize large amounts of network traffic, including full packet payloads. An investigator can query such a system for a specific string and check whether any previously transmitted packet contained it. A shortcoming of the previously proposed techniques is that they cannot support wildcard queries, an important query type that lets the investigator locate strings in the payload when only part of the string is known. This paper presents a new data structure for payload attribution, named Character Dependent Multi-Bloom Filters, which improves on the previously proposed techniques and also supports wildcard queries. A theoretical study of the proposed method evaluates its false-positive rate when responding to queries, and the analysis is verified through a number of experiments. The proposed method is also compared with the state-of-the-art attribution techniques in the literature. The results suggest that Character Dependent Multi-Bloom Filters achieve a data reduction ratio of about 265:1, as opposed to the 210:1 obtained by the previous state-of-the-art techniques at a similar false-positive rate. More importantly, the results indicate that a wildcard query with seven unknown characters takes less than 1 second to process with the proposed method, whereas the previous techniques, which require an exhaustive search, would take about 4500 years to process the same query.
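To make the wildcard problem concrete, the following is a minimal sketch of a plain Bloom-filter payload store, not the Character Dependent Multi-Bloom Filter itself. It illustrates why a conventional filter can only answer a wildcard query by enumerating every possible completion. The class and function names, the block size of 4, the filter dimensions, and the use of `?` as the wildcard byte are all assumptions made for illustration.

```python
import hashlib
from itertools import product


class BloomFilter:
    """Plain Bloom filter over byte strings (illustrative parameters)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))


def store_payload(bf, payload, block_size=4):
    """Insert every block_size-byte window of the payload (hypothetical scheme)."""
    for i in range(len(payload) - block_size + 1):
        bf.add(payload[i:i + block_size])


def wildcard_candidates(pattern, wildcard=ord("?")):
    """Yield every concrete byte string matching the pattern.

    Simplification: the byte '?' always means "unknown", so a literal '?'
    cannot be queried. Each unknown position multiplies the work by 256.
    """
    holes = [i for i, b in enumerate(pattern) if b == wildcard]
    for fill in product(range(256), repeat=len(holes)):
        candidate = bytearray(pattern)
        for pos, value in zip(holes, fill):
            candidate[pos] = value
        yield bytes(candidate)


def wildcard_query(bf, pattern):
    """Exhaustive search: test each candidate against the filter."""
    return [c for c in wildcard_candidates(pattern) if c in bf]
```

With one unknown character, `wildcard_query(bf, b"GE? ")` tests 256 candidates. Assuming a 256-value byte alphabet, seven unknown characters would require `256 ** 7` (roughly 7.2e16) membership tests, which is the kind of exhaustive cost a wildcard-aware structure is meant to avoid. The matches returned may also include false positives, as with any Bloom filter.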
