Mining specifications of malicious behavior

Malware detectors require a specification of maliciousbehavior. Typically, these specifications are manually constructedby investigating known malware. We present an automatic technique to overcome this laborious manual process. Our technique derives such a specification by comparing the execution behavior of a known malware against the execution behaviors of a set of benign programs. In other words, we mine the malicious behavior present in a known malware that is not present in a set of benign programs. The output of our algorithm can be used by malware detectors to detect malware variants. Since our algorithm provides a succinct description of malicious behavior present in a malware, it can also be used by security analysts for understanding the malware. We have implemented a prototype based on our algorithm and tested it on several malware programs. Experimental results obtained from our prototype indicate that our algorithm is effective in extracting malicious behaviors that can be used to detect malware variants

[1]  Susan Horwitz,et al.  Identifying the semantic and textual differences between two versions of a program , 1990, PLDI '90.

[2]  Xiangyu Zhang,et al.  Matching execution histories of program versions , 2005, ESEC/FSE-13.

[3]  Stefan Katzenbeisser,et al.  Detecting Malicious Code by Model Checking , 2005, DIMVA.

[4]  Peter Szor,et al.  HUNTING FOR METAMORPHIC , 2001 .

[5]  Christopher Krügel,et al.  On the Detection of Anomalous System Call Arguments , 2003, ESORICS.

[6]  Somesh Jha,et al.  Testing malware detectors , 2004, ISSTA '04.

[7]  James Bailey,et al.  Mining Minimal Contrast Subgraph Patterns , 2006, SDM.

[8]  Daniel Jackson,et al.  Semantic Diff: a tool for summarizing the effects of modifications , 1994, Proceedings 1994 International Conference on Software Maintenance.

[9]  George C. Necula,et al.  Mining Temporal Specifications for Error Detection , 2005, TACAS.

[10]  Giovanni Vigna,et al.  Reverse Engineering of Network Signatures , 2005 .

[11]  J. Laski,et al.  Identification of program modifications and its applications in software maintenance , 1992, Proceedings Conference on Software Maintenance 1992.

[12]  R. Sekar,et al.  Dataflow anomaly detection , 2006, 2006 IEEE Symposium on Security and Privacy (S&P'06).

[13]  Somesh Jha,et al.  Efficient Context-Sensitive Intrusion Detection , 2004, NDSS.

[14]  Zhenmin Li,et al.  PR-Miner: automatically extracting implicit programming rules and detecting violations in large software code , 2005, ESEC/FSE-13.

[15]  James R. Larus,et al.  Mining specifications , 2002, POPL '02.

[16]  Alessandro Orso,et al.  A differencing algorithm for object-oriented programs , 2004, Proceedings. 19th International Conference on Automated Software Engineering, 2004..

[17]  Shinji Kusumoto,et al.  CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code , 2002, IEEE Trans. Software Eng..

[18]  Carey Nachenberg,et al.  Computer virus-antivirus coevolution , 1997, Commun. ACM.

[19]  Christopher Krügel,et al.  Detecting kernel-level rootkits through binary analysis , 2004, 20th Annual Computer Security Applications Conference.

[20]  Christopher Krügel,et al.  Exploring Multiple Execution Paths for Malware Analysis , 2007, 2007 IEEE Symposium on Security and Privacy (SP '07).

[21]  Steven S. Muchnick,et al.  Advanced Compiler Design and Implementation , 1997 .

[22]  Susan Horwitz,et al.  Using Slicing to Identify Duplication in Source Code , 2001, SAS.

[23]  R. Sekar,et al.  A fast automaton-based method for detecting anomalous program behaviors , 2001, Proceedings 2001 IEEE Symposium on Security and Privacy. S&P 2001.

[24]  Stan Jarzabek,et al.  Detecting higher-level similarity patterns in programs , 2005, ESEC/FSE-13.

[25]  Somesh Jha,et al.  Semantics-aware malware detection , 2005, 2005 IEEE Symposium on Security and Privacy (S&P'05).