Detection and robustness evaluation of android malware classifiers

Android malware attacks are tremendously increasing, and evasion techniques become more and more effective. For this reason, it is necessary to continuously improve the detection performances. With this paper, we wish to pursue this purpose with two contributions. On one hand, we aim at evaluating how improving machine learning-based malware detectors, and on the other hand, we investigate to which extent adversarial attacks can deteriorate the performances of the classifiers. Analysis of malware samples is performed using static and dynamic analysis. This paper proposes a framework for integrating both static and dynamic features trained on machine learning methods and deep neural network. On employing machine learning algorithms, we obtain an accuracy of 97.59% with static features using SVM, and 95.64% is reached with dynamic features using Random forest. Additionally, a 100% accuracy was obtained with CART and SVM using hybrid attributes (on combining relevant static and dynamic features). Further, using deep neural network models, experimental results showed an accuracy of 99.28% using static features, 94.61% using dynamic attributes, and 99.59% by combining both static and dynamic features (also known as multi-modal attributes). Besides, we evaluated the robustness of classifiers against evasion and poisoning attack. In particular comprehensive analysis was performed using permission, APIs, app components and system calls (especially n-grams of system calls). We noticed that the performances of the classifiers significantly dropped while simulating evasion attack using static features, and in some cases 100% of adversarial examples were wrongly labelled by the classification models. Additionally, we show that models trained using dynamic features are also vulnerable to attack, however they exhibit more resilience than a classifier built on static features.

[1]  Fabio Roli,et al.  Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning , 2017, Pattern Recognit..

[2]  Rui Zhang,et al.  Malware identification using visualization images and deep learning , 2018, Comput. Secur..

[3]  Shu-Tao Xia,et al.  Back-propagation neural network on Markov chains from system call sequences: a new approach for detecting Android malware with system call sequences , 2017, IET Inf. Secur..

[4]  Simone Atzeni,et al.  Evaluation of Android Malware Detection Based on System Calls , 2016, IWSPA@CODASPY.

[5]  Aristide Fattori,et al.  CopperDroid: Automatic Reconstruction of Android Malware Behaviors , 2015, NDSS.

[6]  Eul Gyu Im,et al.  A Multimodal Deep Learning Method for Android Malware Detection Using Various Features , 2019, IEEE Transactions on Information Forensics and Security.

[7]  Roberto Baldoni,et al.  Survey on the Usage of Machine Learning Techniques for Malware Analysis , 2017, Comput. Secur..

[8]  Mark Stamp,et al.  A Comparative Analysis of Android Malware , 2019, ICISSP.

[9]  Shouhuai Xu,et al.  DroidEye: Fortifying Security of Learning-Based Classifier Against Adversarial Android Malware Attacks , 2018, 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM).

[10]  Fabio Martinelli,et al.  Dynamic malware detection and phylogeny analysis using process mining , 2018, International Journal of Information Security.

[11]  Konstantin Berlin,et al.  Deep neural network based malware detection using two dimensional binary program features , 2015, 2015 10th International Conference on Malicious and Unwanted Software (MALWARE).

[12]  Xiaolei Wang,et al.  A Novel Hybrid Mobile Malware Detection System Integrating Anomaly Detection With Misuse Detection , 2015, MCS '15.

[13]  Divya Bansal,et al.  Malware Analysis and Classification: A Survey , 2014 .

[14]  Abdelouahid Derhab,et al.  MalDozer: Automatic framework for android malware detection using deep learning , 2018, Digit. Investig..

[15]  Witawas Srisa-an,et al.  Significant Permission Identification for Machine-Learning-Based Android Malware Detection , 2018, IEEE Transactions on Industrial Informatics.

[16]  Kanubhai K. Patel,et al.  Detection and Mitigation of Android Malware Through Hybrid Approach , 2015, SSCC.

[17]  Fabio Martinelli,et al.  Evaluating Convolutional Neural Network for Effective Mobile Malware Detection , 2017, KES.

[18]  Abien Fred Agarap,et al.  Towards Building an Intelligent Anti-Malware System: A Deep Learning Approach using Support Vector Machine (SVM) for Malware Classification , 2017, ArXiv.

[19]  B. S. Manjunath,et al.  Malware images: visualization and automatic classification , 2011, VizSec '11.

[20]  Brian Mac Namee,et al.  Deep learning at the shallow end: Malware classification for non-domain experts , 2018, Digit. Investig..

[21]  Hongnian Yu,et al.  SAMADroid: A Novel 3-Level Hybrid Malware Detection Model for Android Operating System , 2018, IEEE Access.

[22]  Shiva Darshan S.L,et al.  Windows Malware Detector Using Convolutional Neural Network Based on Visualization Images , 2019, IEEE Transactions on Emerging Topics in Computing.

[23]  MI ZHOU,et al.  A HYBRID FEATURE SELECTION METHOD BASED ON FISHER SCORE AND GENETIC ALGORITHM , 2016 .

[24]  Adam Doupé,et al.  Deep Android Malware Detection , 2017, CODASPY.

[25]  Sevil Sen,et al.  Coevolution of Mobile Malware and Anti-Malware , 2018, IEEE Transactions on Information Forensics and Security.

[26]  Bo Li,et al.  Automated poisoning attacks and defenses in malware detection systems: An adversarial machine learning approach , 2017, Comput. Secur..

[27]  Lorenzo Cavallaro,et al.  Intriguing Properties of Adversarial ML Attacks in the Problem Space , 2019, 2020 IEEE Symposium on Security and Privacy (SP).

[28]  Sabu Emmanuel,et al.  GSDroid: Graph Signal Based Compact Feature Representation for Android Malware Detection , 2020, Expert Syst. Appl..

[29]  Luca Verderame,et al.  Obfuscapk: An open-source black-box obfuscation tool for Android apps , 2020, SoftwareX.

[30]  Patrick D. McDaniel,et al.  Adversarial Examples for Malware Detection , 2017, ESORICS.

[31]  Vitor Monte Afonso,et al.  Identifying Android malware using dynamically obtained features , 2014, Journal of Computer Virology and Hacking Techniques.

[32]  Jun Sun,et al.  Auditing Anti-Malware Tools by Evolving Android Malware and Dynamic Loading Technique , 2017, IEEE Transactions on Information Forensics and Security.

[33]  Konrad Rieck,et al.  DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket , 2014, NDSS.

[34]  Yoseba K. Penya,et al.  N-grams-based File Signatures for Malware Detection , 2009, ICEIS.

[35]  Mark Stamp,et al.  A comparison of static, dynamic, and hybrid analysis for malware detection , 2015, Journal of Computer Virology and Hacking Techniques.

[36]  Yanjun Qi,et al.  Automatically Evading Classifiers: A Case Study on PDF Malware Classifiers , 2016, NDSS.

[37]  Fabio Roli,et al.  Yes, Machine Learning Can Be More Secure! A Case Study on Android Malware Detection , 2017, IEEE Transactions on Dependable and Secure Computing.

[38]  Gianluca Dini,et al.  MADAM: Effective and Efficient Behavior-based Android Malware Detection and Prevention , 2018, IEEE Transactions on Dependable and Secure Computing.

[39]  Quan Qian,et al.  Deep Learning and Visualization for Identifying Malware Families , 2018, IEEE Transactions on Dependable and Secure Computing.

[40]  Simin Nadjm-Tehrani,et al.  Crowdroid: behavior-based malware detection system for Android , 2011, SPSM '11.

[41]  Vlado Keselj,et al.  N-gram-based detection of new malicious code , 2004, Proceedings of the 28th Annual International Computer Software and Applications Conference, 2004. COMPSAC 2004..

[42]  R. Vinayakumar,et al.  DeepMalNet: Evaluating shallow and deep networks for static PE malware detection , 2018, ICT Express.

[43]  Shih-Hao Hung,et al.  DroidDolphin: a dynamic Android malware detection framework using big data and machine learning , 2014, RACS '14.

[44]  Jun Zhu,et al.  Boosting Adversarial Attacks with Momentum , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[45]  Sheng-De Wang,et al.  Machine Learning Based Hybrid Behavior Models for Android Malware Analysis , 2015, 2015 IEEE International Conference on Software Quality, Reliability and Security.

[46]  Samuel Greengard,et al.  Cybersecurity gets smart , 2016, Commun. ACM.

[47]  Xi Xiao,et al.  CSCdroid: Accurately Detect Android Malware via Contribution-Level-Based System Call Categorization , 2017, 2017 IEEE Trustcom/BigDataSE/ICESS.

[48]  Lorenzo Cavallaro,et al.  TESSERACT: Eliminating Experimental Bias in Malware Classification across Space and Time , 2018, USENIX Security Symposium.

[49]  Madhumita Chatterjee,et al.  A Novel Approach to Detect Android Malware , 2015 .