Data De-duplication and Event Processing for Security Applications on an Embedded Processor

Network security schemes generally deploy sensors and other network devices that generate huge volumes of data, overwhelming the underlying decision-making algorithms. An example is a corporate network employing intrusion detection systems, where a deluge of alert data confounds the computations involved in sensor information fusion and alert correlation. One way to obtain fast, real-time responses is to preprocess such data down to a manageable size. In this paper, we show that data de-duplication using computationally efficient fingerprinting algorithms can provide real-time results. We present an algorithm that uses the Rabin fingerprinting (hashing) scheme for data de-duplication. We have implemented this algorithm on the Intel Atom, a powerful, energy-efficient embedded processor. Our study is intended to show that relatively low-performing embedded processors can provide the computational support needed to handle security functions in the field. Compared with the algorithm's performance on a high-end system, viz. an Intel Core 2 Duo processor, the positive results obtained make a case for using the Atom processor in networked applications employing mobile devices.
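To make the idea of fingerprint-based de-duplication concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes alerts arrive as text records, computes a Rabin-style fingerprint by reducing each record (read as a polynomial over GF(2)) modulo a fixed degree-64 polynomial, and keeps only the first record seen for each fingerprint. The names `rabin_fingerprint`, `deduplicate`, and `POLY`, the choice of modulus, and the sample alert strings are all illustrative assumptions; Rabin's scheme proper selects the irreducible polynomial at random.

```python
# Assumed modulus: x^64 + x^4 + x^3 + x + 1, a commonly listed irreducible
# polynomial over GF(2). Rabin's scheme would pick one at random instead.
DEGREE = 64
POLY = (1 << DEGREE) | 0b11011


def rabin_fingerprint(data: bytes) -> int:
    """Return the record, read as a GF(2) polynomial, reduced modulo POLY."""
    # Start from 1 (a prepended 1 bit) so leading zero bytes are not ignored.
    fp = 1
    for byte in data:
        for shift in range(7, -1, -1):
            # Shift in the next message bit (multiply by x, add the bit).
            fp = (fp << 1) | ((byte >> shift) & 1)
            # If the degree reached 64, subtract (XOR) the modulus.
            if fp >> DEGREE:
                fp ^= POLY
    return fp


def deduplicate(alerts):
    """Keep the first alert for each fingerprint, preserving arrival order."""
    seen = set()
    unique = []
    for alert in alerts:
        fp = rabin_fingerprint(alert.encode("utf-8"))
        if fp not in seen:
            seen.add(fp)
            unique.append(alert)
    return unique


# Hypothetical alert records, for illustration only.
sample_alerts = [
    "10.0.0.5 -> 10.0.0.9 port scan",
    "10.0.0.7 -> 10.0.0.9 ssh brute force",
    "10.0.0.5 -> 10.0.0.9 port scan",
]
unique_alerts = deduplicate(sample_alerts)
```

Because distinct records can in principle share a fingerprint, a deployment that cannot tolerate dropping a non-duplicate alert would compare the stored record as well before discarding. The sketch omits that check for brevity.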
