Learning Program Behavior for Run-Time Software Assurance

In this paper we present techniques for learning program behavior from observed application-level events in order to support run-time anomaly detection. We exploit two key relationships among event sequences: their proximity in edit distance and the state information embedded in event data. We integrate two techniques that employ these relationships to reduce both false positives and false negatives. Our techniques consider event sequences in their entirety, and thus better leverage correlations among events over longer time periods than most other techniques that use small, fixed-length sliding windows over such sequences. We employ cluster signatures that minimize the adverse effects of noise on anomaly detection, thereby further reducing false positives. We leverage state information in event data to summarize loop structures in sequences, which in turn leads to better classification of program behavior. We have performed initial validations of these techniques using Asterisk®, a widely deployed, open source digital PBX.
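
To make the notion of edit distance proximity between whole event sequences concrete, the following is a minimal sketch and not the implementation used in this work. It assumes each event is reduced to a string label, uses the standard Levenshtein distance with unit costs, and flags a sequence as anomalous when it is farther than a chosen threshold from every known-good sequence. The function names, the event labels, and the threshold are hypothetical and serve only as illustration; cluster signatures and loop summarization are not modeled here.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two event sequences (unit costs)."""
    # prev[j] holds the distance between the first i-1 events of a
    # and the first j events of b.
    prev = list(range(len(b) + 1))
    for i, event_a in enumerate(a, start=1):
        curr = [i]
        for j, event_b in enumerate(b, start=1):
            substitution = 0 if event_a == event_b else 1
            curr.append(min(
                prev[j] + 1,                 # delete event_a
                curr[j - 1] + 1,             # insert event_b
                prev[j - 1] + substitution,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def is_anomalous(sequence: Sequence[str],
                 known_good: List[Sequence[str]],
                 threshold: int) -> bool:
    """Hypothetical check: anomalous if no known-good sequence is near."""
    nearest = min(
        (edit_distance(sequence, ref) for ref in known_good),
        default=float("inf"),  # no reference data: treat as anomalous
    )
    return nearest > threshold


# Illustrative event labels only; not taken from Asterisk logs.
normal_call = ["INVITE", "RING", "ANSWER", "HANGUP"]
missed_call = ["INVITE", "RING", "HANGUP"]
odd_trace = ["ANSWER", "ANSWER", "INVITE", "CRASH", "CRASH"]
```

Because the distance is computed over complete sequences, a single missing or extra event changes the score by one regardless of where it occurs, whereas a fixed-length sliding window only sees mismatches that fall within its span.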
