An Android Malware Detection Approach Using Bayesian Inference

Android malware detection has been a popular research topic due to the non-negligible amount of malware targeting the Android operating system. In particular, the naive Bayes generative classifier is a common technique widely adopted in many papers. However, we found that the naive Bayes classifier performs poorly on the Contagio Malware Dump dataset, which could result from its assumption that no feature dependency exists. In this paper, we propose a lightweight method for Android malware detection that improves the performance of Bayesian classification on the Contagio Malware Dump dataset. It performs static analysis to gather malicious features from an application, and applies principal component analysis to reduce the dependencies among them. With the hidden naive Bayes model, we can then infer whether the application is benign or malicious. In an evaluation with 15,573 normal applications and 3,150 malicious samples, our approach detects 94.5% of the malware with a false positive rate of 1.0%. The experiment also shows that our approach is feasible on smartphones.
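To illustrate why the feature-independence assumption can hurt, the following is a minimal sketch of a plain Bernoulli naive Bayes classifier on toy data. It is not the authors' implementation, and it covers neither principal component analysis nor the hidden naive Bayes model. The function names, the toy samples, and the single binary feature (imagine "requests a given permission") are hypothetical and chosen only for illustration.

```python
import math

def train_bernoulli_nb(samples, labels, alpha=1.0):
    """Estimate class priors and per-feature Bernoulli parameters (Laplace-smoothed)."""
    n_features = len(samples[0])
    model = {}
    for c in set(labels):
        rows = [x for x, y in zip(samples, labels) if y == c]
        prior = len(rows) / len(samples)
        # P(feature_j = 1 | class c), smoothed so it is never exactly 0 or 1
        theta = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_features)]
        model[c] = (prior, theta)
    return model

def posterior(model, x):
    """Return P(class | x), treating every feature as independent given the class."""
    log_joint = {}
    for c, (prior, theta) in model.items():
        s = math.log(prior)
        for xj, t in zip(x, theta):
            s += math.log(t if xj else 1.0 - t)
        log_joint[c] = s
    # Normalize in log space for numerical stability
    m = max(log_joint.values())
    z = sum(math.exp(v - m) for v in log_joint.values())
    return {c: math.exp(v - m) / z for c, v in log_joint.items()}

# Hypothetical toy data: label 1 = malware, label 0 = benign, one binary feature.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
one_feature = [[1], [1], [1], [0], [1], [0], [0], [0]]

# Same information, but the feature is copied three times (perfectly dependent features).
three_copies = [row * 3 for row in one_feature]

p_single = posterior(train_bernoulli_nb(one_feature, labels), [1])[1]
p_copied = posterior(train_bernoulli_nb(three_copies, labels), [1, 1, 1])[1]

print(f"P(malware | feature present), one feature:  {p_single:.3f}")
print(f"P(malware | feature present), three copies: {p_copied:.3f}")
```

On this toy data the malware posterior rises from 2/3 to 8/9 even though the copied features carry no new evidence. The classifier counts the same signal three times, which is the kind of overconfidence that reducing dependencies among features is meant to limit.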
