Space efficient deep packet inspection of compressed web traffic

In this paper we focus on deep packet inspection of compressed web traffic. The major limiting factor that compression imposes on this process is its high memory requirement of 32KB per connection, which translates into hundreds of megabytes to gigabytes of main memory in a multi-connection setting. We introduce new algorithms and techniques that drastically reduce this space requirement for bump-in-the-wire devices such as security and other content-based networking tools. Our proposed scheme improves space performance by almost 80% and time performance by over 40%, making real-time inspection of compressed traffic a viable option for networking devices.
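
To make the memory figure concrete, the following is a minimal back-of-the-envelope sketch of how a 32KB per-connection buffer scales with the number of concurrent connections. The connection counts (10,000 and 100,000) and the helper names `total_window_memory` and `to_mib` are illustrative assumptions, not values or code from the paper.

```python
# Per-connection buffer size stated in the abstract: 32KB.
WINDOW_BYTES = 32 * 1024


def total_window_memory(connections: int, per_connection_bytes: int = WINDOW_BYTES) -> int:
    """Return the bytes needed to hold one buffer per concurrent connection."""
    if connections < 0:
        raise ValueError("connections must be non-negative")
    return connections * per_connection_bytes


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical connection counts, chosen only for illustration.
    for connections in (10_000, 100_000):
        total = total_window_memory(connections)
        print(f"{connections} connections: {to_mib(total):.1f} MiB")
```

Under these assumed loads the total ranges from a few hundred megabytes to a few gigabytes, which matches the order of magnitude described above.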
