ASAP: Automatic Semantics-Aware Analysis of Network Payloads

Automatic inspection of network payloads is a prerequisite for effective analysis of network communication. Security research has largely focused on network analysis using protocol specifications, for example for intrusion detection, fuzz testing and forensic analysis. The specification of a protocol alone, however, is often not sufficient for accurate analysis of communication, as it fails to reflect individual semantics of network applications. We propose a framework for semantics-aware analysis of network payloads which automatically extracts semanticsaware components from recorded network traffic. Our method proceeds by mapping network payloads to a vector space and identifying communication templates corresponding to base directions in the vector space. We demonstrate the efficacy of semantics-aware analysis in different security applications: automatic discovery of patterns in honeypot data, analysis of malware communication and network intrusion detection.

[1]  Bernhard Schölkopf,et al.  Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.

[2]  Christopher Krügel,et al.  Leveraging User Interactions for In-Depth Testing of Web Applications , 2008, RAID.

[3]  Christopher Krügel,et al.  Automatic Network Protocol Analysis , 2008, NDSS.

[4]  Christophe Diot,et al.  Diagnosing network-wide traffic anomalies , 2004, SIGCOMM.

[5]  M Damashek,et al.  Gauging Similarity with n-Grams: Language-Independent Categorization of Text , 1995, Science.

[6]  Marc Dacier,et al.  Automatic Handling of Protocol Dependencies and Reaction to 0-Day Attacks with ScriptGen Based Honeypots , 2006, RAID.

[7]  Konrad Rieck,et al.  Linear-Time Computation of Similarity Measures for Sequential Data , 2008, J. Mach. Learn. Res..

[8]  Roy T. Fielding,et al.  Hypertext Transfer Protocol - HTTP/1.1 , 1997, RFC.

[9]  Vern Paxson,et al.  Bro: a system for detecting network intruders in real-time , 1998, Comput. Networks.

[10]  Salvatore J. Stolfo,et al.  Anomalous Payload-Based Worm Detection and Signature Generation , 2005, RAID.

[11]  S. Holm A Simple Sequentially Rejective Multiple Test Procedure , 1979 .

[12]  Xiangliang Zhang,et al.  Fast intrusion detection based on a non-negative matrix factorization model , 2009, J. Netw. Comput. Appl..

[13]  Jennifer Rexford,et al.  Sensitivity of PCA for traffic anomaly detection , 2007, SIGMETRICS '07.

[14]  Marc Dacier,et al.  ScriptGen: an automated script generation tool for Honeyd , 2005, 21st Annual Computer Security Applications Conference (ACSAC'05).

[15]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[16]  David Brumley,et al.  Replayer: automatic protocol replay by binary analysis , 2006, CCS '06.

[17]  Christopher Krügel,et al.  Prospex: Protocol Specification Extraction , 2009, 2009 30th IEEE Symposium on Security and Privacy.

[18]  Alfred O. Hero,et al.  Manifold learning visualization of network traffic data , 2005, MineNet '05.

[19]  R. Tibshirani,et al.  Sparse Principal Component Analysis , 2006 .

[20]  Vern Paxson,et al.  A high-level programming environment for packet trace anonymization and transformation , 2003, SIGCOMM '03.

[21]  Chris H. Q. Ding,et al.  Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization , 2008, SIGIR '08.

[22]  Konrad Rieck,et al.  TokDoc: a self-healing web application firewall , 2010, SAC '10.

[23]  Konrad Rieck,et al.  Detecting Unknown Network Attacks Using Language Models , 2006, DIMVA.

[24]  Konrad Rieck,et al.  Botzilla: detecting the "phoning home" of malicious software , 2010, SAC '10.

[25]  Helen J. Wang,et al.  Discoverer: Automatic Protocol Reverse Engineering from Network Traces , 2007, USENIX Security Symposium.

[26]  John McHugh,et al.  The Contact Surface: A Technique for Exploring Internet Scale Emergent Behaviors , 2008, DIMVA.

[27]  Giovanni Vigna,et al.  NetSTAT: A Network-based Intrusion Detection System , 1999, J. Comput. Secur..

[28]  Felix C. Freiling,et al.  The Nepenthes Platform: An Efficient Approach to Collect Malware , 2006, RAID.

[29]  Salvatore J. Stolfo,et al.  Anagram: A Content Anomaly Detector Resistant to Mimicry Attack , 2006, RAID.

[30]  Patrik O. Hoyer,et al.  Non-negative Matrix Factorization with Sparseness Constraints , 2004, J. Mach. Learn. Res..

[31]  Xiangliang Zhang,et al.  Constructing attribute weights from computer audit data for effective intrusion detection , 2009, J. Syst. Softw..

[32]  Ian T. Jolliffe,et al.  Principal Component Analysis , 2002, International Encyclopedia of Statistical Science.

[33]  David Moore,et al.  Code-Red: a case study on the spread and victims of an internet worm , 2002, IMW '02.

[34]  Randy H. Katz,et al.  Protocol-Independent Adaptive Replay of Application Dialog , 2006, NDSS.

[35]  Martin Roesch,et al.  Snort - Lightweight Intrusion Detection for Networks , 1999 .

[36]  Radu State,et al.  Advanced fuzzing in the VoIP space , 2010, Journal in Computer Virology.

[37]  Pavel Laskov,et al.  Detection of Intrusions and Malware, and Vulnerability Assessment: 19th International Conference, DIMVA 2022, Cagliari, Italy, June 29 –July 1, 2022, Proceedings , 2022, International Conference on Detection of intrusions and malware, and vulnerability assessment.

[38]  Salvatore J. Stolfo,et al.  Anomalous Payload-Based Network Intrusion Detection , 2004, RAID.

[39]  Christopher Krügel,et al.  Dynamic Analysis of Malicious Code , 2006, Journal in Computer Virology.

[40]  Shaoying Liu,et al.  Generating test data from state‐based specifications , 2003, Softw. Test. Verification Reliab..