Android HIV: A Study of Repackaging Malware for Evading Machine-Learning Detection

Machine learning-based solutions have been successfully employed for the automatic detection of malware on Android. However, machine learning models lack robustness to adversarial examples, which are crafted by adding carefully chosen perturbations to the normal inputs. So far, the adversarial examples can only deceive detectors that rely on syntactic features (e.g., requested permissions, API calls, etc.), and the perturbations can only be implemented by simply modifying application’s manifest. While recent Android malware detectors rely more on semantic features from Dalvik bytecode rather than manifest, existing attacking/defending methods are no longer effective. In this paper, we introduce a new attacking method that generates adversarial examples of Android malware and evades being detected by the current models. To this end, we propose a method of applying optimal perturbations onto Android APK that can successfully deceive the machine learning detectors. We develop an automated tool to generate the adversarial examples without human intervention. In contrast to existing works, the adversarial examples crafted by our method can also deceive recent machine learning-based detectors that rely on semantic features such as control-flow-graph. The perturbations can also be implemented directly onto APK’s Dalvik bytecode rather than Android manifest to evade from recent detectors. We demonstrate our attack on two state-of-the-art Android malware detection schemes, MaMaDroid and Drebin. Our results show that the malware detection rates decreased from 96% to 0% in MaMaDroid, and from 97% to 0% in Drebin, with just a small number of codes to be inserted into the APK.

[1]  Fabio Roli,et al.  Yes, Machine Learning Can Be More Secure! A Case Study on Android Malware Detection , 2017, IEEE Transactions on Dependable and Secure Computing.

[2]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[3]  Hahn-Ming Lee,et al.  DroidMat: Android Malware Detection through Manifest and API Calls Tracing , 2012, 2012 Seventh Asia Joint Conference on Information Security.

[4]  Konrad Rieck,et al.  Structural detection of android malware using embedded call graphs , 2013, AISec.

[5]  Martín Abadi,et al.  Adversarial Patch , 2017, ArXiv.

[6]  Ming Fan,et al.  DAPASA: Detecting Android Piggybacked Apps Through Sensitive Subgraph Analysis , 2017, IEEE Transactions on Information Forensics and Security.

[7]  Konrad Rieck,et al.  DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket , 2014, NDSS.

[8]  Lior Rokach,et al.  Generic Black-Box End-to-End Attack against RNNs and Other API Calls Based Malware Classifiers , 2017, ArXiv.

[9]  Ying Tan,et al.  Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN , 2017, DMBD.

[10]  Percy Liang,et al.  Adversarial Examples for Evaluating Reading Comprehension Systems , 2017, EMNLP.

[11]  Ananthram Swami,et al.  Crafting adversarial input sequences for recurrent neural networks , 2016, MILCOM 2016 - 2016 IEEE Military Communications Conference.

[12]  Qing Fu Hong Machine learning for malware detection , 2017 .

[13]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[14]  Ke Xu,et al.  ICCDetector: ICC-Based Malware Detection on Android , 2016, IEEE Transactions on Information Forensics and Security.

[15]  Jiajun Lu,et al.  Adversarial Examples that Fool Detectors , 2017, ArXiv.

[16]  Mu Zhang,et al.  Semantics-Aware Android Malware Classification Using Weighted Contextual API Dependency Graphs , 2014, CCS.

[17]  Blaine Nelson,et al.  The security of machine learning , 2010, Machine Learning.

[18]  Samy Bengio,et al.  Adversarial examples in the physical world , 2016, ICLR.

[19]  Yanfang Ye,et al.  SecureDroid: Enhancing Security of Machine Learning-based Detection against Adversarial Android Malware Attacks , 2017, ACSAC.

[20]  Xin Sun,et al.  Detection, Classification and Characterization of Android Malware Using API Data Dependency , 2015, SecureComm.

[21]  Carl A. Gunter,et al.  Malware Detection in Adversarial Settings: Exploiting Feature Evolutions and Confusions in Android Apps , 2017, ACSAC.

[22]  Muttukrishnan Rajarajan,et al.  Android Security: A Survey of Issues, Malware Penetration, and Defenses , 2015, IEEE Communications Surveys & Tutorials.

[23]  Patrick D. McDaniel,et al.  Adversarial Examples for Malware Detection , 2017, ESORICS.

[24]  Yan Wang,et al.  Static Control-Flow Analysis of User-Driven Callbacks in Android Applications , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[25]  Pan He,et al.  Adversarial Examples: Attacks and Defenses for Deep Learning , 2017, IEEE Transactions on Neural Networks and Learning Systems.

[26]  Yao Du,et al.  An Android Malware Detection Approach Using Community Structures of Weighted Function Call Graphs , 2017, IEEE Access.

[27]  Bo Li,et al.  Automated poisoning attacks and defenses in malware detection systems: An adversarial machine learning approach , 2017, Comput. Secur..

[28]  Lior Rokach,et al.  Generic Black-Box End-to-End Attack Against State of the Art API Call Based Malware Classifiers , 2017, RAID.

[29]  Chao Yang,et al.  DroidMiner: Automated Mining and Characterization of Fine-grained Malicious Behaviors in Android Applications , 2014, ESORICS.

[30]  Fabio Roli,et al.  Evasion Attacks against Machine Learning at Test Time , 2013, ECML/PKDD.

[31]  Aziz Mohaisen,et al.  Android Malware Detection Using Complex-Flows , 2017, 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS).

[32]  Dawn Xiaodong Song,et al.  Delving into Transferable Adversarial Examples and Black-box Attacks , 2016, ICLR.

[33]  Dejing Dou,et al.  HotFlip: White-Box Adversarial Examples for Text Classification , 2017, ACL.

[34]  Joan Bruna,et al.  Intriguing properties of neural networks , 2013, ICLR.

[35]  Pavel Laskov,et al.  Practical Evasion of a Learning-Based Classifier: A Case Study , 2014, 2014 IEEE Symposium on Security and Privacy.

[36]  Daniele Sgandurra,et al.  A Survey on Security for Mobile Devices , 2013, IEEE Communications Surveys & Tutorials.

[37]  Logan Engstrom,et al.  Synthesizing Robust Adversarial Examples , 2017, ICML.

[38]  Gianluca Stringhini,et al.  MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models (Extended Version) , 2016, NDSS 2017.

[39]  Seyed-Mohsen Moosavi-Dezfooli,et al.  DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  David A. Wagner,et al.  Towards Evaluating the Robustness of Neural Networks , 2016, 2017 IEEE Symposium on Security and Privacy (SP).

[41]  Blaine Nelson,et al.  Can machine learning be secure? , 2006, ASIACCS '06.

[42]  Xingquan Zhu,et al.  Machine Learning for Android Malware Detection Using Permission and API Calls , 2013, 2013 IEEE 25th International Conference on Tools with Artificial Intelligence.

[43]  Juan E. Tapiador,et al.  Evolution, Detection and Analysis of Malware for Smart Devices , 2014, IEEE Communications Surveys & Tutorials.

[44]  Ananthram Swami,et al.  The Limitations of Deep Learning in Adversarial Settings , 2015, 2016 IEEE European Symposium on Security and Privacy (EuroS&P).

[45]  Jason Nieh,et al.  A measurement study of google play , 2014, SIGMETRICS '14.

[46]  Win Zaw,et al.  Permission-Based Android Malware Detection , 2013 .