Model checking for malicious family detection and phylogenetic analysis in mobile environment

Abstract Malware targeting mobile devices is widespread: given the large amount of sensitive and private information stored on tablets and smartphones, these devices are an attractive attack surface for malware developers. On the defensive side, the well-known weaknesses of current anti-malware technologies prevent the detection not only of new malicious payloads but also of obfuscated malware, even when only trivial obfuscation techniques are applied by automatic morphing engines. A threat is recognized only if its signature is present in the anti-malware repository, and signature extraction is typically a time-consuming task performed by security analysts. In this paper we propose a two-fold method that aims to (i) identify the family to which a malicious mobile application belongs and (ii) place the application in the correct position in the phylogenetic tree. We represent application system call traces as automata and, through process mining, extract temporal logic properties that are then verified in a formal verification environment. The evaluation on a dataset of more than 12,000 Android applications (4552 malicious samples dating from 2010 to 2018, 4552 samples obfuscated with three different obfuscation engines, and 3500 legitimate applications) confirms the effectiveness of the proposed formal methods-based approach, which obtains an accuracy ranging from 0.882 to 0.987 in the analysis of 12 widespread real-world malicious families implementing different behaviours.
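
As a rough illustration of the idea of turning system call traces into an automaton and checking a property over it, the following minimal Python sketch builds a directly-follows graph from traces and tests whether one system call can eventually be followed by another. It is not the authors' implementation: plain graph reachability stands in for a temporal logic formula, and the function names (`directly_follows`, `can_eventually_reach`) and the example traces are hypothetical.

```python
from collections import defaultdict, deque


def directly_follows(traces):
    """Build a graph with an edge a -> b whenever call b directly follows a."""
    graph = defaultdict(set)
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            graph[a].add(b)
    return graph


def can_eventually_reach(graph, start, target):
    """Return True if target is reachable from start in one or more steps."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, ()))
    return False


# Hypothetical system call traces, invented for illustration only.
traces = [
    ["open", "read", "close", "socket", "connect", "sendto"],
    ["open", "read", "close"],
]
graph = directly_follows(traces)

# Simplified property: a read can eventually be followed by a network send.
read_then_send = can_eventually_reach(graph, "read", "sendto")
send_then_read = can_eventually_reach(graph, "sendto", "read")
```

In this sketch a family would be characterised by a set of such properties, and an application would be assigned to a family when its automaton satisfies them; a real model checker evaluates far more expressive formulae than simple reachability.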
