Combining static analysis and dynamic learning to build context sensitive models of program behavior

This dissertation describes a family of models of program behavior, the Hybrid Push Down Automata (HPDA), that can be acquired using a combination of static analysis and dynamic learning to take advantage of the strengths of both. Static analysis is used to acquire a base model of all behavior defined in the binary code of the program. Dynamic learning from audit data supplements the base model, yielding a model that exactly follows the definition in the executable but also includes legal behavior determined at runtime. Our model is similar to the VPStatic model proposed by Feng, Giffin, et al., but with different assumptions and organization. The model is built from return address information extracted from the program call stack together with system call information. It can be acquired by dynamic learning alone or by a combination of static analysis and dynamic learning. We show that a new dynamic learning algorithm, based on the assumption of a single entry point and a single exit point for each function, can yield models of increased generality and can help reduce the false positive rate. Previous approaches based on static analysis typically work only with statically linked programs. We have developed a new component-based model and learning algorithm that builds separate models for the dynamic libraries used in a program, allowing these library models to be shared by different program models. Sharing reduces memory usage when several programs are monitored, promotes reuse of library models, and simplifies model maintenance when the system updates dynamic libraries. Experiments demonstrate that the prototype detection system built with the HPDA approach has a performance overhead of less than 6% and can be used with complex real-world applications. When compared to other detection systems based on analysis of operating system calls, the HPDA approach is shown to converge faster during learning, to detect attacks that escape other detection systems, and to have a lower false positive rate.
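The general idea of supplementing a statically derived base model with behavior observed at runtime can be illustrated with a toy sketch. This is not the HPDA learning algorithm described in the dissertation; it is a simplified illustration under stated assumptions. Each audit event is assumed to be a pair of (return addresses on the call stack, system call name), the "static base" is assumed to be a pre-computed set of allowed transitions, and the names `transition`, `ToyModel`, `learn`, and `anomalies` are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Stack = Tuple[int, ...]              # return addresses, outermost frame first
Event = Tuple[Stack, str]            # (call stack, system call name)
Transition = Tuple[Stack, Stack, str]


def transition(prev_stack: Stack, stack: Stack, syscall: str) -> Transition:
    """Drop the frames shared by both stacks; keep what was popped and pushed."""
    common = 0
    for a, b in zip(prev_stack, stack):
        if a != b:
            break
        common += 1
    return (tuple(prev_stack[common:]), tuple(stack[common:]), syscall)


class ToyModel:
    """Set of allowed transitions: a static base plus dynamically learned ones."""

    def __init__(self, static_base: Iterable[Transition] = ()) -> None:
        # Assumption: the static base was produced elsewhere (e.g. by analysis
        # of the binary); here it is simply handed in as a set of transitions.
        self.allowed: Set[Transition] = set(static_base)

    def learn(self, trace: Iterable[Event]) -> None:
        """Add every transition seen in a (presumed attack-free) training trace."""
        prev: Stack = ()
        for stack, syscall in trace:
            self.allowed.add(transition(prev, stack, syscall))
            prev = stack

    def anomalies(self, trace: Iterable[Event]) -> List[Transition]:
        """Return the transitions in a trace that the model does not allow."""
        prev: Stack = ()
        found: List[Transition] = []
        for stack, syscall in trace:
            t = transition(prev, stack, syscall)
            if t not in self.allowed:
                found.append(t)
            prev = stack
        return found


# Hypothetical addresses and traces, for illustration only.
static_base = {((), (0x10, 0x20), "open")}
model = ToyModel(static_base)

training = [((0x10, 0x20), "open"), ((0x10, 0x30), "read"), ((0x10, 0x40), "close")]
model.learn(training)

attack = [((0x10, 0x20), "open"), ((0x10, 0x99), "execve")]
flagged = model.anomalies(attack)
print(flagged)
```

In this sketch the training trace is accepted without anomalies, while the `execve` issued from an unseen return address is flagged. A real context-sensitive model would track calls and returns with a pushdown stack rather than a flat set of stack differences.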
