A Visualization-Based Analysis on Classifying Android Malware

Since the introduction of the Android mobile platform, the state of mobile malware has evolved in both attack sophistication and its ability to evade detection. Given the right combination of elements, the detection of malicious applications may be found among those that pose no threat, yet the threats that exist across these malware types reveal distinguishable attack characteristics. This paper investigates the benign and attacking characteristics. By plotting complex features into dendrograms, we propose a novel approach to visually distinguish Android apps. We visualize the complicated relationship and evaluate the effect of different text mining methods. Specifically, we employ machine learning techniques including feature reduction using Principle Component Analysis, and the Random Forest classifier, to compare eight different models. Using the Drebin dataset, we achieved an average accuracy of 95.83%.

[1]  Nicolas Christin,et al.  All Your Droid Are Belong to Us: A Survey of Current Android Attacks , 2011, WOOT.

[2]  Roland Eils,et al.  circlize implements and enhances circular visualization in R , 2014, Bioinform..

[3]  Rory Coulter,et al.  Intelligent agents defending for an IoT world: A review , 2018, Comput. Secur..

[4]  John Q. Gan,et al.  A new term weighting scheme based on class specific document frequency for document representation and classification , 2015, 2015 7th Computer Science and Electronic Engineering Conference (CEEC).

[5]  Jun Sun,et al.  Auditing Anti-Malware Tools by Evolving Android Malware and Dynamic Loading Technique , 2017, IEEE Transactions on Information Forensics and Security.

[6]  Fabio Martinelli,et al.  R-PackDroid: API package-based characterization and detection of mobile ransomware , 2017, SAC.

[7]  Yanfang Ye,et al.  HinDroid: An Intelligent Android Malware Detection System Based on Structured Heterogeneous Information Network , 2017, KDD.

[8]  Yajin Zhou,et al.  RiskRanker: scalable and accurate zero-day android malware detection , 2012, MobiSys '12.

[9]  Paul Rimba,et al.  Data-Driven Cybersecurity Incident Prediction: A Survey , 2019, IEEE Communications Surveys & Tutorials.

[10]  Andy Liaw,et al.  Classification and Regression by randomForest , 2007 .

[11]  Konrad Rieck,et al.  DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket , 2014, NDSS.

[12]  Barath Narayanan Narayanan,et al.  Performance analysis of machine learning and pattern recognition algorithms for Malware classification , 2016, 2016 IEEE National Aerospace and Electronics Conference (NAECON) and Ohio Innovation Summit (OIS).

[13]  Juan E. Tapiador,et al.  Dendroid: A text mining approach to analyzing and classifying code structures in Android malware families , 2014, Expert Syst. Appl..

[14]  Majid Komeili,et al.  Local Feature Selection for Data Classification , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Wolfgang Banzhaf,et al.  The use of computational intelligence in intrusion detection systems: A review , 2010, Appl. Soft Comput..

[16]  Leyla Bilge,et al.  Needles in a Haystack: Mining Information from Public Dynamic Analysis Sandboxes for Malware Intelligence , 2015, USENIX Security Symposium.

[17]  Wenye Wang,et al.  Claim What You Need: A Text-Mining Approach on Android Permission Request Authorization , 2014, GLOBECOM 2014.

[18]  Tal Galili,et al.  dendextend: an R package for visualizing, adjusting and comparing trees of hierarchical clustering , 2015, Bioinform..

[19]  Baoli Li,et al.  Weighted Document Frequency for feature selection in text classification , 2015, 2015 International Conference on Asian Language Processing (IALP).

[20]  Pascal Vincent,et al.  Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Zhenlong Yuan,et al.  DroidDetector: Android Malware Characterization and Detection Using Deep Learning , 2016 .

[22]  Yang Liu,et al.  Context-Aware, Adaptive, and Scalable Android Malware Detection Through Online Learning , 2017, IEEE Transactions on Emerging Topics in Computational Intelligence.

[23]  Arun Lakhotia,et al.  DroidLegacy: Automated Familial Classification of Android Malware , 2014, PPREW'14.

[24]  Tudor Dumitras,et al.  FeatureSmith: Automatically Engineering Features for Malware Detection by Mining the Security Literature , 2016, CCS.

[25]  Isil Dillig,et al.  Apposcopy: semantics-based detection of Android malware through static analysis , 2014, SIGSOFT FSE.

[26]  Mansour Ahmadi,et al.  Novel Feature Extraction, Selection and Fusion for Effective Malware Family Classification , 2015, CODASPY.

[27]  Jun Zhang,et al.  Detecting and Preventing Cyber Insider Threats: A Survey , 2018, IEEE Communications Surveys & Tutorials.

[28]  Jun Zhang,et al.  Network Traffic Classification Using Correlation Information , 2013, IEEE Transactions on Parallel and Distributed Systems.