Learning stateful models for network honeypots

Attacks like call fraud and identity theft often involve sophisticated stateful attack patterns which, on top of normal communication, try to harm systems on a higher semantic level than usual attack scenarios. To detect these kinds of threats via specially deployed honeypots, at least a minimal understanding of the inherent state machine of a specific service is needed to lure potential attackers and to sustain a communication for a sufficiently large number of steps. To this end, we propose PRISMA, a method for protocol inspection and state machine analysis, which infers a functional state machine and message format of a protocol from network traffic alone. We apply our method to three real-life network traces ranging from 10,000 up to 2 million messages of both binary and textual protocols. We show that PRISMA is capable of simulating complete and correct sessions based on the learned models. A case study on malware traffic reveals the different states of the execution, rendering PRISMA a valuable tool for malware analysis.
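The idea of inferring a state machine from traffic and then simulating sessions from it can be illustrated with a small sketch. This is not PRISMA's algorithm: it assumes, purely for illustration, that each message has already been reduced to a label and that a first-order transition count over those labels is an adequate model. The function names `learn_transitions` and `most_likely_session`, the `<start>` and `<end>` markers, and the toy FTP-like sessions are all hypothetical.

```python
from collections import defaultdict


def learn_transitions(sessions):
    """Count transitions between consecutive message labels in each session."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # Hypothetical boundary markers so session start and end are modeled too.
        path = ["<start>"] + list(session) + ["<end>"]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    return {state: dict(successors) for state, successors in counts.items()}


def most_likely_session(transitions, max_steps=20):
    """Follow the most frequent transition from each state until the end marker."""
    state, replay = "<start>", []
    for _ in range(max_steps):
        successors = transitions.get(state)
        if not successors:
            break
        # Sorting first makes ties resolve deterministically (alphabetically).
        state = max(sorted(successors), key=successors.get)
        if state == "<end>":
            break
        replay.append(state)
    return replay


# Toy, hand-written sessions (assumed data, not taken from the paper's traces).
sessions = [
    ["USER", "PASS", "LIST", "QUIT"],
    ["USER", "PASS", "RETR", "QUIT"],
    ["USER", "PASS", "LIST", "QUIT"],
]
model = learn_transitions(sessions)
replay = most_likely_session(model)
print(model["PASS"])
print(replay)
```

A honeypot built on such a model would answer each incoming message with a plausible successor state; a realistic system would additionally need to learn message formats and field contents, which this sketch leaves out.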

Nicole Krämer | Konrad Rieck | Hugo Gascon | Tammo Krueger
