Zero-Day Signature Extraction for High-Volume Attacks

We present a basic tool for zero-day attack signature extraction. Given two large sets of messages, $P$, the messages captured in the network at peacetime (i.e., mostly legitimate traffic), and $A$, the messages captured during an attack (i.e., containing many attack messages), the tool extracts a set $S$ of strings that are frequently found in $A$ but not in $P$, thus allowing the attack packets to be identified. This is an important tool for protecting Internet sites from worm attacks and distributed denial-of-service attacks, and it may also be useful for other problems, including command-and-control identification and DNA sequence analysis. The main contributions of this paper are the system we developed to extract the required signatures, the definition of the string heavy-hitters problem, and an algorithm for solving it. This algorithm finds popular strings of variable length in a set of messages, using the classic heavy-hitter algorithm as a building block in a nontrivial way. The algorithm runs in linear time and requires a single pass over the input. Our system uses this algorithm to extract the desired signatures. Furthermore, we provide an extended algorithm that identifies groups of signatures often found together in the same packets, which further improves the quality of the signatures generated by our system. Using our system, a previously unknown attack can be detected and stopped within minutes of its start.
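To make the idea concrete, the following is a minimal sketch, not the paper's algorithm. It assumes fixed-length substrings (n-grams) rather than the variable-length strings the paper handles, uses the classic Misra-Gries heavy-hitter summary as the building block, and checks candidates against $P$ with an exact substring scan for simplicity. The function names (`misra_gries`, `ngrams`, `candidate_signatures`) and all parameters and thresholds are hypothetical choices made for illustration.

```python
def misra_gries(stream, k):
    """Classic Misra-Gries summary using at most k - 1 counters.

    Any item occurring more than N/k times in a stream of length N is
    guaranteed to be in the result; stored counts are underestimates.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def ngrams(messages, n):
    """Yield every length-n substring of every message (assumed strings)."""
    for msg in messages:
        for i in range(len(msg) - n + 1):
            yield msg[i:i + n]


def candidate_signatures(attack, peace, n=6, k=64,
                         min_count=10, max_peace_fraction=0.05):
    """Return n-grams that are heavy in `attack` and rare in `peace`.

    Simplification: the peacetime check is an exact scan over `peace`,
    not a second streaming summary.
    """
    heavy = misra_gries(ngrams(attack, n), k)
    signatures = set()
    for gram, estimate in heavy.items():
        if estimate < min_count:
            continue
        peace_hits = sum(1 for msg in peace if gram in msg)
        if peace_hits <= max_peace_fraction * max(len(peace), 1):
            signatures.add(gram)
    return signatures
```

Calling `candidate_signatures(A, P)` on two lists of message strings returns the set of n-grams that survive both filters. Handling variable-length strings in one pass, as the abstract describes, requires more than this fixed-length sketch.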
