MalProfiler: Automatic and Effective Classification of Android Malicious Apps in Behavioral Classes

Android malicious apps are currently the main security threat for mobile devices. Due to their exponential growth in number of samples, it is vital to timely recognize and classify any new threat, to identify and effectively apply specific countermeasures. In this paper we propose MalProfiler, a framework which performs fast and effective analysis of Android malicious apps, based on the analysis of a set of static app features. The proposed approach exploits an algorithm named Categorical Clustering Tree (CCTree), which can be used both as a divisive clustering algorithm, or as a trainable classifier for supervised learning classification. Hence, the CCTree has been exploited to perform both homogeneous clustering, grouping similar malicious apps for simplified analysis, and to classify them in predefined behavioral classes. The approach has been tested on a set of 3500 real malicious apps belonging to more than 200 families, showing both an high clustering capability, measured through internal and external evaluation, together with an accuracy of 97% in classifying malicious apps according to their behavior.

[1]  Yajin Zhou,et al.  Dissecting Android Malware: Characterization and Evolution , 2012, 2012 IEEE Symposium on Security and Privacy.

[2]  Julio Gonzalo,et al.  A comparison of extrinsic clustering evaluation metrics based on formal constraints , 2008, Information Retrieval.

[3]  Christopher Krügel,et al.  Scalable, Behavior-Based Malware Clustering , 2009, NDSS.

[4]  Gianluca Dini,et al.  Evaluating the Trust of Android Applications through an Adaptive and Distributed Multi-criteria Approach , 2013, 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications.

[5]  Nadia Tawbi,et al.  Clustering spam emails into campaigns , 2015, 2015 International Conference on Information Systems Security and Privacy (ICISSP).

[6]  Gianluca Dini,et al.  MADAM: Effective and Efficient Behavior-based Android Malware Detection and Prevention , 2018, IEEE Transactions on Dependable and Secure Computing.

[7]  Mu Zhang,et al.  Semantics-Aware Android Malware Classification Using Weighted Contextual API Dependency Graphs , 2014, CCS.

[8]  R. Suganya,et al.  Data Mining Concepts and Techniques , 2010 .

[9]  Konrad Rieck,et al.  DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket , 2014, NDSS.

[10]  Steve Hanna,et al.  Android permissions demystified , 2011, CCS '11.

[11]  Randy Kerber,et al.  ChiMerge: Discretization of Numeric Attributes , 1992, AAAI.

[12]  Gianluca Dini,et al.  MADAM: A Multi-level Anomaly Detector for Android Malware , 2012, MMM-ACNS.

[13]  Francisco Herrera,et al.  A Survey of Discretization Techniques: Taxonomy and Empirical Analysis in Supervised Learning , 2013, IEEE Transactions on Knowledge and Data Engineering.

[14]  Fabio Martinelli,et al.  Digital Waste Sorting: A Goal-Based, Self-Learning Approach to Label Spam Email Campaigns , 2015, STM.

[15]  Eric Medvet,et al.  Effectiveness of Opcode ngrams for Detection of Multi Family Android Malware , 2015, 2015 10th International Conference on Availability, Reliability and Security.

[16]  Fabio Martinelli,et al.  Fast and Effective Clustering of Spam Emails Based on Structural Similarity , 2015, FPS.

[17]  Philip Chan,et al.  Determining the number of clusters/segments in hierarchical clustering/segmentation algorithms , 2004, 16th IEEE International Conference on Tools with Artificial Intelligence.

[18]  J. Ross Quinlan,et al.  Induction of Decision Trees , 1986, Machine Learning.