DroidCat: Effective Android Malware Detection and Categorization via App-Level Profiling

Most existing Android malware detection and categorization techniques are static approaches, which suffer from evasion attacks, such as obfuscation. By analyzing program behaviors, dynamic approaches are potentially more resilient against these attacks. Yet existing dynamic approaches mostly rely on characterizing system calls which are subject to system-call obfuscation. This paper presents DroidCat, a novel dynamic app classification technique, to complement existing approaches. By using a diverse set of dynamic features based on method calls and inter-component communication (ICC) Intents without involving permission, app resources, or system calls while fully handling reflection, DroidCat achieves superior robustness than static approaches as well as dynamic approaches relying on system calls. The features were distilled from a behavioral characterization study of benign versus malicious apps. Through three complementary evaluation studies with 34 343 apps from various sources and spanning the past nine years, we demonstrated the stability of DroidCat in achieving high classification performance and superior accuracy compared with the two state-of-the-art peer techniques that represent both static and dynamic approaches. Overall, DroidCat achieved 97% F1-measure accuracy consistently for classifying apps evolving over the nine years, detecting or categorizing malware, 16%–27% higher than any of the two baselines compared. Furthermore, our experiments with obfuscated benchmarks confirmed higher robustness of DroidCat over these baseline techniques. We also investigated the effects of various design decisions on DroidCat’s effectiveness and the most important features for our dynamic classification. We found that features capturing app execution structure such as the distribution of method calls over user code and libraries are much more important than typical security features such as sensitive flows.

[1]  Guofei Gu,et al.  Shadow attacks: automatically evading system-call-behavior based malware detection , 2011, Journal in Computer Virology.

[2]  Yuval Elovici,et al.  “Andromaly”: a behavioral malware detection framework for android devices , 2012, Journal of Intelligent Information Systems.

[3]  Jacques Klein,et al.  Effective inter-component communication mapping in Android with Epicc: an essential step towards holistic security analysis , 2013 .

[4]  Glenn Fung,et al.  On the Dangers of Cross-Validation. An Experimental Evaluation , 2008, SDM.

[5]  Yajin Zhou,et al.  RiskRanker: scalable and accurate zero-day android malware detection , 2012, MobiSys '12.

[6]  Karim O. Elish,et al.  High Precision Screening for Android Malware with Dimensionality Reduction , 2014, 2014 13th International Conference on Machine Learning and Applications.

[7]  J. Ross Quinlan,et al.  Induction of Decision Trees , 1986, Machine Learning.

[8]  Gianluca Stringhini,et al.  MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models (Extended Version) , 2016, NDSS 2017.

[9]  Wenke Lee,et al.  Checking More and Alerting Less: Detecting Privacy Leakages via Enhanced Data-flow Analysis and Peer Voting , 2015, NDSS.

[10]  Stephanie Forrest,et al.  The Evolution of System-Call Monitoring , 2008, 2008 Annual Computer Security Applications Conference (ACSAC).

[11]  Sotiris B. Kotsiantis,et al.  Supervised Machine Learning: A Review of Classification Techniques , 2007, Informatica.

[12]  Petar Tsankov,et al.  Statistical Deobfuscation of Android Applications , 2016, CCS.

[13]  Aristide Fattori,et al.  CopperDroid: Automatic Reconstruction of Android Malware Behaviors , 2015, NDSS.

[14]  Chao Yang,et al.  DroidMiner: Automated Mining and Characterization of Fine-grained Malicious Behaviors in Android Applications , 2014, ESORICS.

[15]  Christopher Krügel,et al.  Limits of Static Analysis for Malware Detection , 2007, Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007).

[16]  Laurie Hendren,et al.  Soot: a Java bytecode optimization framework , 2010, CASCON.

[17]  Gianluca Stringhini,et al.  A Family of Droids-Android Malware Detection via Behavioral Modeling: Static vs Dynamic Analysis , 2018, 2018 16th Annual Conference on Privacy, Security and Trust (PST).

[18]  Giorgio Giacinto,et al.  Stealth attacks: An extended insight into the obfuscation effects on Android malware , 2015, Comput. Secur..

[19]  Xudong Ma,et al.  Dynamic Android Malware Classification Using Graph-Based Representations , 2016, 2016 IEEE 3rd International Conference on Cyber Security and Cloud Computing (CSCloud).

[20]  Hahn-Ming Lee,et al.  DroidMat: Android Malware Detection through Manifest and API Calls Tracing , 2012, 2012 Seventh Asia Joint Conference on Information Security.

[21]  David Grove,et al.  Optimization of Object-Oriented Programs Using Static Class Hierarchy Analysis , 1995, ECOOP.

[22]  Youssef B. Mahdy,et al.  Behavior-based features model for malware detection , 2016, Journal of Computer Virology and Hacking Techniques.

[23]  Raúl A. Santelices,et al.  Diver: precise dynamic impact analysis using dependence-based trace pruning , 2014, ASE.

[24]  Tzi-cker Chiueh,et al.  Automatic Generation of String Signatures for Malware Detection , 2009, RAID.

[25]  Tao Xie,et al.  AppContext: Differentiating Malicious and Benign Mobile App Behaviors Using Context , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[26]  Christian Platzer,et al.  MARVIN: Efficient and Comprehensive Mobile App Classification through Static and Dynamic Analysis , 2015, 2015 IEEE 39th Annual Computer Software and Applications Conference.

[27]  Christopher. Simons,et al.  Machine learning with Python , 2017 .

[28]  Mansour Ahmadi,et al.  DroidScribe: Classifying Android Malware Based on Runtime Behavior , 2016, 2016 IEEE Security and Privacy Workshops (SPW).

[29]  Peng Liu,et al.  Impeding behavior-based malware analysis via replacement attacks to malware specifications , 2017, Journal of Computer Virology and Hacking Techniques.

[30]  N. Altman An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression , 1992 .

[31]  Alessandra Gorla,et al.  Mining Apps for Abnormal Usage of Sensitive Data , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[32]  Patrick D. McDaniel,et al.  On lightweight mobile phone application certification , 2009, CCS.

[33]  Ninghui Li,et al.  Using probabilistic generative models for ranking risks of Android apps , 2012, CCS.

[34]  Mansour Ahmadi,et al.  DroidSieve: Fast and Accurate Classification of Obfuscated Android Malware , 2017, CODASPY.

[35]  Vitor Monte Afonso,et al.  Identifying Android malware using dynamically obtained features , 2014, Journal of Computer Virology and Hacking Techniques.

[36]  Ali Feizollah,et al.  The Evolution of Android Malware and Android Analysis Techniques , 2017, ACM Comput. Surv..

[37]  Haipeng Cai,et al.  Leveraging Historical Versions of Android Apps for Efficient and Precise Taint Analysis , 2018, 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR).

[38]  Yajin Zhou,et al.  Dissecting Android Malware: Characterization and Evolution , 2012, 2012 IEEE Symposium on Security and Privacy.

[39]  Nic Herndon,et al.  Experimental Study with Real-world Data for Android App Security Analysis using Machine Learning , 2015, ACSAC.

[40]  Yanick Fratantonio,et al.  ANDRUBIS -- 1,000,000 Apps Later: A View on Current Android Malware Behaviors , 2014, 2014 Third International Workshop on Building Analysis Datasets and Gathering Experience Returns for Security (BADGERS).

[41]  Ke Xu,et al.  ICCDetector: ICC-Based Malware Detection on Android , 2016, IEEE Transactions on Information Forensics and Security.

[42]  Haipeng Cai,et al.  Understanding Android Application Programming and Security: A Dynamic Study , 2017, 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[43]  Jacques Klein,et al.  AndroZoo: Collecting Millions of Android Apps for the Research Community , 2016, 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR).

[44]  Abhinav Srivastava,et al.  Operating System Interface Obfuscation and the Revealing of Hidden Operations , 2011, DIMVA.

[45]  Simin Nadjm-Tehrani,et al.  Crowdroid: behavior-based malware detection system for Android , 2011, SPSM '11.

[46]  Somesh Jha,et al.  Semantics-aware malware detection , 2005, 2005 IEEE Symposium on Security and Privacy (S&P'05).

[47]  Heng Yin,et al.  DroidAPIMiner: Mining API-Level Features for Robust Malware Detection in Android , 2013, SecureComm.

[48]  Minhui Xue,et al.  StormDroid: A Streaminglized Machine Learning-Based System for Detecting Android Malware , 2016, AsiaCCS.

[49]  Haipeng Cai,et al.  DroidFax: A Toolkit for Systematic Characterization of Android Applications , 2017, 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[50]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[51]  Dawn Xiaodong Song,et al.  Limits of Learning-based Signature Generation with Adversaries , 2008, NDSS.

[52]  Malinda Dilhara,et al.  Automated Detection and Repair of Incompatible Uses of Runtime Permissions in Android Apps , 2018, 2018 IEEE/ACM 5th International Conference on Mobile Software Engineering and Systems (MOBILESoft).

[53]  Isil Dillig,et al.  Apposcopy: semantics-based detection of Android malware through static analysis , 2014, SIGSOFT FSE.

[54]  Michael Pradel,et al.  Making Malory Behave Maliciously: Targeted Fuzzing of Android Execution Environments , 2017, 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE).

[55]  Eric Medvet,et al.  Acquiring and Analyzing App Metrics for Effective Mobile Malware Detection , 2016, IWSPA@CODASPY.

[56]  Gianluca Stringhini,et al.  A Family of Droids: Analyzing Behavioral Model based Android Malware Detection via Static and Dynamic Analysis , 2018, ArXiv.

[57]  Aziz Mohaisen,et al.  Detecting and Classifying Android Malware Using Static Analysis along with Creator Information , 2015, Int. J. Distributed Sens. Networks.

[58]  Konrad Rieck,et al.  Structural detection of android malware using embedded call graphs , 2013, AISec.

[59]  Heejo Lee,et al.  Detecting metamorphic malwares using code graphs , 2010, SAC '10.

[60]  Konrad Rieck,et al.  DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket , 2014, NDSS.

[61]  Haipeng Cai,et al.  Dissecting Android Inter-component Communications via Interactive Visual Explorations , 2017, 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[62]  Haipeng Cai,et al.  DroidCat: Unified Dynamic Detection of Android Malware , 2016 .

[63]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[64]  Matthew L. Dering,et al.  Composite Constant Propagation: Application to Android Inter-Component Communication Analysis , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[65]  Jacques Klein,et al.  Empirical assessment of machine learning-based malware detectors for Android , 2014, Empirical Software Engineering.

[66]  Haipeng Cai,et al.  ICC-Inspect: Supporting Runtime Inspection of Android Inter-Component Communications , 2018, 2018 IEEE/ACM 5th International Conference on Mobile Software Engineering and Systems (MOBILESoft).

[67]  Thomas G. Dietterich Overfitting and undercomputing in machine learning , 1995, CSUR.

[68]  Ninghui Li,et al.  Android permissions: a perspective combining risks and benefits , 2012, SACMAT '12.

[69]  Christopher Krügel,et al.  Scalable, Behavior-Based Malware Clustering , 2009, NDSS.

[70]  Gianluca Dini,et al.  MADAM: Effective and Efficient Behavior-based Android Malware Detection and Prevention , 2018, IEEE Transactions on Dependable and Secure Computing.

[71]  Mu Zhang,et al.  Semantics-Aware Android Malware Classification Using Weighted Contextual API Dependency Graphs , 2014, CCS.