Paper Information - Random Forest Classification for Detecting Android Malware

Random Forest Classification for Detecting Android Malware

Internet-connected smartphones play a crucial role in the Internet of Things application domain. These devices are widely used for day-to-day activities such as remotely controlling lighting and heating at home, paying for parking, and, more recently, paying for goods with saved credit card information via Near Field Communication (NFC). Android is the most popular smartphone platform today. It is also the platform of choice for malware authors seeking to obtain secure and private data. In this paper we apply only the Random Forest supervised ensemble classifier to an Android feature dataset of 48919 points with 42 features each. Our goal was to measure the accuracy of Random Forest in classifying Android applications as malicious or benign based on their behavior. Moreover, we wanted to examine how detection accuracy changes as the free parameters of the Random Forest algorithm, such as the number of trees, the depth of each tree, and the number of random features selected, are varied. Our experimental results, based on 5-fold cross validation on our dataset, show that Random Forest performs very well, with an accuracy of over 99 percent in general, an optimal Out-Of-Bag (OOB) error rate [3] of 0.0002 for forests with 40 trees or more, and a root mean squared error of 0.0171 for 160 trees.
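To make the abstract's free parameters concrete, the following is a minimal pure-Python sketch of a random forest with an Out-Of-Bag error estimate. It is not the authors' implementation and does not use their dataset: the data is a synthetic toy set, and every name here (`make_toy_data`, `build_tree`, `fit_forest`, `oob_error`) is hypothetical, chosen only for illustration. The three knobs the paper varies appear as `n_trees`, `max_depth`, and `n_features`.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows, labels, max_depth, n_features, rng):
    """Grow one tree; each split only considers n_features random features."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority  # leaf: depth exhausted or node is pure
    best = None  # (weighted impurity, feature index, threshold)
    for f in rng.sample(range(len(rows[0])), n_features):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return majority  # no usable split among the sampled features
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (
        f,
        t,
        build_tree([rows[i] for i in li], [labels[i] for i in li], max_depth - 1, n_features, rng),
        build_tree([rows[i] for i in ri], [labels[i] for i in ri], max_depth - 1, n_features, rng),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are 4-tuples."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, labels, n_trees, max_depth, n_features, seed=0):
    """Train n_trees trees, each on a bootstrap sample; remember each bag."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        tree = build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                          max_depth, n_features, rng)
        forest.append((tree, set(idx)))
    return forest


def oob_error(forest, rows, labels):
    """Error rate when each point is voted on only by trees that never saw it."""
    wrong = counted = 0
    for i, (row, y) in enumerate(zip(rows, labels)):
        votes = Counter(predict_tree(tree, row) for tree, bag in forest if i not in bag)
        if votes:
            counted += 1
            wrong += votes.most_common(1)[0][0] != y
    return wrong / counted if counted else float("nan")


def make_toy_data(n=120, n_features=4, seed=1):
    """Synthetic stand-in data (an assumption): label 1 iff feature 0 > 0.5."""
    rng = random.Random(seed)
    rows = [[rng.random() for _ in range(n_features)] for _ in range(n)]
    labels = [int(r[0] > 0.5) for r in rows]
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_data()
    for n_trees in (5, 20):
        forest = fit_forest(rows, labels, n_trees=n_trees, max_depth=3, n_features=2)
        print(n_trees, "trees -> OOB error", round(oob_error(forest, rows, labels), 4))
```

The OOB estimate needs no separate held-out set, because each bootstrap sample leaves out roughly a third of the points. For real experiments a tested library implementation would be the more sensible choice; this sketch only shows where the three parameters enter.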

Mohammed S. Alam | Son Thanh Vuong

[1] Antonio Criminisi, et al. Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning, 2012, Found. Trends Comput. Graph. Vis.

[2] Yuval Elovici, et al. “Andromaly”: a behavioral malware detection framework for android devices, 2012, Journal of Intelligent Information Systems.

[3] Jasper Snoek, et al. Practical Bayesian Optimization of Machine Learning Algorithms, 2012, NIPS.

[4] Leo Breiman, et al. Random Forests, 2001, Machine Learning.

[5] Simin Nadjm-Tehrani, et al. Crowdroid: behavior-based malware detection system for Android, 2011, SPSM '11.

[6] Win Zaw, et al. Permission-Based Android Malware Detection, 2013.

[7] Lior Rokach, et al. Ensemble-based classifiers, 2010, Artificial Intelligence Review.

[8] Ludmila I. Kuncheva, et al. Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy, 2003, Machine Learning.

[9] Lei-da Chen, et al. Mobile Payment Adoption in the US: A Cross-industry, Crossplatform Solution, 2005.

[10] Sehun Kim, et al. A Malicious Application Detection Framework using Automatic Feature Extraction Tool on Android Market, 2013.

[11] Gianluca Dini, et al. MADAM: A Multi-level Anomaly Detector for Android Malware, 2012, MMM-ACNS.