Hardening Malware Detection Systems Against Cyber Maneuvers: An Adversarial Machine Learning Approach

The evolution of mobile malware poses a serious threat to smartphone security. Today, sophisticated attackers can adapt by maximally sabotaging machine-learning classifiers via polluting training data, rendering most recent machine learning-based malware detection tools (such as Drebin and DroidAPIMiner) ineffective. In this paper, we explore the feasibility of constructing crafted malware samples; examine how machine-learning classifiers can be misled under three different threat models; then conclude that injecting carefully crafted data into training data can significantly reduce detection accuracy. To tackle the problem, we propose KuafuDet, a two-phase learning enhancing approach that learns mobile malware by adversarial detection. KuafuDet includes an offline training phase that selects and extracts features from the training set, and an online detection phase that utilizes the classifier trained by the first phase. To further address the adversarial environment, these two phases are intertwined through a self-adaptive learning scheme, wherein an automated camouflage detector is introduced to filter the suspicious false negatives and feed them back into the training phase. We finally show KuafuDet significantly reduces false negatives and boosts the detection accuracy by at least 15%. Experiments on more than 250,000 mobile applications demonstrate that KuafuDet is scalable and can be highly effective as a standalone system.

[1]  Leyla Bilge,et al.  Needles in a Haystack: Mining Information from Public Dynamic Analysis Sandboxes for Malware Intelligence , 2015, USENIX Security Symposium.

[2]  Patrick P. K. Chan,et al.  Adversarial Feature Selection Against Evasion Attacks , 2016, IEEE Transactions on Cybernetics.

[3]  Lingling Fan,et al.  POSTER: Accuracy vs. Time Cost: Detecting Android Malware through Pareto Ensemble Pruning , 2016, CCS.

[4]  Yuval Elovici,et al.  “Andromaly”: a behavioral malware detection framework for android devices , 2012, Journal of Intelligent Information Systems.

[5]  Fabio Roli,et al.  Security Evaluation of Pattern Classifiers under Attack , 2014, IEEE Transactions on Knowledge and Data Engineering.

[6]  Beilun Wang,et al.  A Theoretical Framework for Robustness of (Deep) Classifiers against Adversarial Examples , 2016, ICLR 2017.

[7]  Zhenkai Liang,et al.  AirBag: Boosting Smartphone Resistance to Malware Infection , 2014, NDSS.

[8]  Aristide Fattori,et al.  CopperDroid: Automatic Reconstruction of Android Malware Behaviors , 2015, NDSS.

[9]  Prateek Saxena,et al.  Auror: defending against poisoning attacks in collaborative deep learning systems , 2016, ACSAC.

[10]  Gianluca Stringhini,et al.  MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models (Extended Version) , 2016, NDSS 2017.

[11]  Byung-Gon Chun,et al.  TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones , 2010, OSDI.

[12]  Jacques Klein,et al.  FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps , 2014, PLDI.

[13]  Tao Xie,et al.  AppContext: Differentiating Malicious and Benign Mobile App Behaviors Using Context , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[14]  Heng Yin,et al.  DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis , 2012, USENIX Security Symposium.

[15]  Xin Li,et al.  Adversarial Examples Detection in Deep Networks with Convolutional Filter Statistics , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[16]  Eric Bodden,et al.  Harvesting Runtime Values in Android Applications That Feature Anti-Analysis Techniques , 2016, NDSS.

[17]  Jacques Klein,et al.  Combining static analysis with probabilistic models to enable market-scale Android inter-component analysis , 2016, POPL.

[18]  Minhui Xue,et al.  Towards adversarial detection of mobile malware: poster , 2016, MobiCom.

[19]  Yajin Zhou,et al.  Fast, scalable detection of "Piggybacked" mobile applications , 2013, CODASPY.

[20]  Ali Feizollah,et al.  AndroDialysis: Analysis of Android Intent Effectiveness in Malware Detection , 2017, Comput. Secur..

[21]  David Lie,et al.  IntelliDroid: A Targeted Input Generator for the Dynamic Analysis of Android Malware , 2016, NDSS.

[22]  Heng Yin,et al.  DroidAPIMiner: Mining API-Level Features for Robust Malware Detection in Android , 2013, SecureComm.

[23]  Yajin Zhou,et al.  Dissecting Android Malware: Characterization and Evolution , 2012, 2012 IEEE Symposium on Security and Privacy.

[24]  Mu Zhang,et al.  Semantics-Aware Android Malware Classification Using Weighted Contextual API Dependency Graphs , 2014, CCS.

[25]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[26]  Harry Wechsler,et al.  Adversarial Spam Detection Using the Randomized Hough Transform-Support Vector Machine , 2013, 2013 12th International Conference on Machine Learning and Applications.

[27]  Yang Liu,et al.  Mystique: Evolving Android Malware for Auditing Anti-Malware Tools , 2016, AsiaCCS.

[28]  Patrick D. McDaniel,et al.  Adversarial Perturbations Against Deep Neural Networks for Malware Classification , 2016, ArXiv.

[29]  Xuxian Jiang,et al.  Catch Me If You Can: Evaluating Android Anti-Malware Against Transformation Attacks , 2014, IEEE Transactions on Information Forensics and Security.

[30]  Yajin Zhou,et al.  Hey, You, Get Off of My Market: Detecting Malicious Apps in Official and Alternative Android Markets , 2012, NDSS.

[31]  F. Brunk INTERSECTION PROBLEMS IN COMBINATORICS , 2009 .

[32]  Somesh Jha,et al.  Analyzing the Robustness of Nearest Neighbors to Adversarial Examples , 2017, ICML.

[33]  Jacques Klein,et al.  IccTA: Detecting Inter-Component Privacy Leaks in Android Apps , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[34]  Muttukrishnan Rajarajan,et al.  PIndroid: A novel Android malware detection system using ensemble learning , 2017 .

[35]  Minhui Xue,et al.  StormDroid: A Streaminglized Machine Learning-Based System for Detecting Android Malware , 2016, AsiaCCS.

[36]  Patrick D. McDaniel,et al.  Machine Learning in Adversarial Settings , 2016, IEEE Security & Privacy.

[37]  Joan Bruna,et al.  Intriguing properties of neural networks , 2013, ICLR.

[38]  Ananthram Swami,et al.  The Limitations of Deep Learning in Adversarial Settings , 2015, 2016 IEEE European Symposium on Security and Privacy (EuroS&P).

[39]  Wei Chen,et al.  More Semantics More Robust: Improving Android Malware Classifiers , 2016, WISEC.

[40]  Alessandra Gorla,et al.  Mining Apps for Abnormal Usage of Sensitive Data , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[41]  Hahn-Ming Lee,et al.  DroidMat: Android Malware Detection through Manifest and API Calls Tracing , 2012, 2012 Seventh Asia Joint Conference on Information Security.

[42]  Junfeng Yang,et al.  Towards Making Systems Forget with Machine Unlearning , 2015, 2015 IEEE Symposium on Security and Privacy.

[43]  Chao Yang,et al.  DroidMiner: Automated Mining and Characterization of Fine-grained Malicious Behaviors in Android Applications , 2014, ESORICS.

[44]  Christopher Krügel,et al.  Execute This! Analyzing Unsafe and Malicious Dynamic Code Loading in Android Applications , 2014, NDSS.

[45]  Wei Liu,et al.  On Sparse Feature Attacks in Adversarial Learning , 2014, ICDM.

[46]  Apu Kapadia,et al.  Soundcomber: A Stealthy and Context-Aware Sound Trojan for Smartphones , 2011, NDSS.

[47]  Eric Bodden,et al.  A Machine-learning Approach for Classifying and Categorizing Android Sources and Sinks , 2014, NDSS.

[48]  Mansour Ahmadi,et al.  DroidScribe: Classifying Android Malware Based on Runtime Behavior , 2016, 2016 IEEE Security and Privacy Workshops (SPW).

[49]  Ian H. Witten,et al.  The WEKA data mining software: an update , 2009, SKDD.

[50]  Konrad Rieck,et al.  DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket , 2014, NDSS.

[51]  Tobias Scheffer,et al.  Static prediction games for adversarial learning problems , 2012, J. Mach. Learn. Res..

[52]  Vladimir Vovk,et al.  Prescience: Probabilistic Guidance on the Retraining Conundrum for Malware Detection , 2016, AISec@CCS.