Process Mining Meets Malware Evolution: A Study of the Behavior of Malicious Code

Mobile phones are increasingly used to exchange and access sensitive resources, which makes them a target for malware attacks. These attacks keep increasing as new and sophisticated malware emerges, often rendering existing malware detection approaches inadequate. Since the majority of new malware is generated from existing malicious code, tracking the phylogeny of mobile malware becomes very important. This work proposes a Process Mining (PM) approach for building a malware phylogeny model from the information contained in system call traces. Adopting a declarative Process Mining technique makes it possible to mine a constraint-based model that can be used effectively as a malware fingerprint, expressing relationships and recurring execution patterns among system calls in the execution flows. The model characterizes the behavior of malware applications, allowing the identification of similarities across malware families and among malware variants belonging to the same family. The proposed approach is evaluated on a dataset of more than 700 infected applications across seven malware families, obtaining very encouraging results.
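To make the idea of a constraint-based fingerprint concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single Declare-style template, response(a, b), meaning that every occurrence of system call a is eventually followed by b, and it assumes Jaccard similarity as a simple way to compare two fingerprints. The function names, the `min_support` parameter, and the toy traces are all hypothetical.

```python
from itertools import permutations


def response_holds(trace, a, b):
    """Return True if every occurrence of `a` is eventually followed by `b`."""
    pending = False
    for call in trace:
        if call == a:
            pending = True   # an activation of `a` is waiting for `b`
        elif call == b:
            pending = False  # `b` fulfils all pending activations
    return not pending


def mine_fingerprint(traces, min_support=1.0):
    """Mine response(a, b) constraints (illustrative template choice).

    A constraint is kept when it holds in at least `min_support` of the
    traces that contain `a` (hypothetical support definition).
    """
    alphabet = sorted({call for trace in traces for call in trace})
    fingerprint = set()
    for a, b in permutations(alphabet, 2):
        activated = [t for t in traces if a in t]
        if not activated:
            continue
        satisfied = sum(response_holds(t, a, b) for t in activated)
        if satisfied / len(activated) >= min_support:
            fingerprint.add(("response", a, b))
    return fingerprint


def jaccard(fp1, fp2):
    """Jaccard similarity between two constraint sets (assumed metric)."""
    if not fp1 and not fp2:
        return 1.0
    return len(fp1 & fp2) / len(fp1 | fp2)


# Toy system call traces for two hypothetical samples.
sample_a = [["open", "read", "close"], ["open", "write", "close"]]
sample_b = [["open", "read", "close"]]

fp_a = mine_fingerprint(sample_a)
fp_b = mine_fingerprint(sample_b)
print(sorted(fp_a))
print(jaccard(fp_a, fp_b))
```

In this sketch, samples whose fingerprints share many constraints score close to 1, which is one simple way to read similarity between variants. A real declarative miner would cover many more constraint templates than the single one assumed here.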
