Dalvik Opcode Graph Based Android Malware Variants Detection Using Global Topology Features

As Android has come to dominate the smartphone operating system market with a share of 86.8%, the number of malicious Android applications has grown rapidly as well. The large volume of diversified malware variants has pushed researchers toward machine learning, which offers a powerful capability for variant detection. Because static analysis of malware plays an important role in system security and opcodes have been shown to be an effective representation of malware, some researchers use Dalvik opcodes as features and apply machine learning to detect Android malware. However, current opcode-based methods still face several problems, such as balancing accuracy against time cost, selecting features, and the lack of understanding or description of the characteristics of malware. To overcome these challenges, we propose a novel method that builds a graph of Dalvik opcodes and analyzes its global topology properties. The method first constructs a weighted probability graph of operations, then uses information entropy to prune this graph while retaining as much information as possible, next extracts several global topology features of the graph to represent malware, and finally searches for similarities between programs using these features. These global topology features capture the high-level characteristics of malware. Our approach provides a lightweight framework for detecting Android malware variants based on graph theory and information theory. Theoretical analysis and real-life experimental results show the effectiveness, efficiency, and robustness of our approach, which achieves high detection accuracy at little cost in training and detection time.
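The four stages named in the abstract (probability graph, entropy-based pruning, global topology features, similarity search) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the pruning rule, the feature set, or the similarity measure, so the per-node entropy cutoff `keep`, the five features returned by `topology_features`, and the use of cosine similarity are all assumptions chosen for illustration, as are the function names and the toy opcode sequences.

```python
import math
from collections import Counter, defaultdict


def build_probability_graph(opcodes):
    """Return {src: {dst: P(dst | src)}} estimated from consecutive opcode pairs."""
    counts = defaultdict(Counter)
    for src, dst in zip(opcodes, opcodes[1:]):
        counts[src][dst] += 1
    return {
        src: {dst: n / sum(out.values()) for dst, n in out.items()}
        for src, out in counts.items()
    }


def prune_by_entropy(graph, keep=0.9):
    """Assumed pruning rule (not from the paper): for each node, keep the
    outgoing edges with the largest entropy terms -p*log2(p) until a fraction
    `keep` of that node's entropy is retained. At least one edge is always
    kept, and the surviving probabilities are not renormalized."""
    pruned = {}
    for src, out in graph.items():
        terms = sorted(
            ((-p * math.log2(p), dst) for dst, p in out.items()), reverse=True
        )
        total = sum(term for term, _ in terms)
        kept, retained = {}, 0.0
        for term, dst in terms:
            if kept and retained >= keep * total:
                break
            kept[dst] = out[dst]
            retained += term
        pruned[src] = kept
    return pruned


def topology_features(graph):
    """Illustrative global features: node count, edge count, density,
    mean out-degree, and maximum out-degree."""
    nodes = set(graph) | {dst for out in graph.values() for dst in out}
    n = len(nodes)
    out_degrees = [len(graph.get(node, {})) for node in nodes]
    m = sum(out_degrees)
    density = m / (n * n) if n else 0.0  # self-loops allowed: n*n possible edges
    mean_out = m / n if n else 0.0
    return [n, m, density, mean_out, max(out_degrees, default=0)]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Toy opcode sequences; real input would come from disassembled DEX code.
    sample = ["const", "invoke", "move", "const", "invoke", "return"]
    variant = ["const", "invoke", "move", "const", "invoke", "move", "return"]
    feats = [
        topology_features(prune_by_entropy(build_probability_graph(seq)))
        for seq in (sample, variant)
    ]
    print(cosine_similarity(feats[0], feats[1]))
```

In this sketch the unscaled node and edge counts dominate the cosine score, so a practical version would normalize the features before comparing programs.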
