Evaluation of machine learning classifiers for mobile malware detection

Mobile devices have become a significant part of people's lives, and the number of users of this technology keeps growing. This growing user base gives attackers an incentive to create malicious applications, while the security of the sensitive data stored on mobile devices is often taken lightly. Relying on existing approaches is not sufficient, because intelligent malware evolves rapidly and therefore becomes more difficult to detect. In this paper, we evaluate malware detection using an anomaly-based approach with machine learning classifiers. From the various network traffic features, we select four categories: basic information, content based, time based and connection based. The evaluation uses two datasets: one public (i.e. MalGenome) and one private (i.e. self-collected). On the MalGenome dataset, the Bayes network and random forest classifiers both achieved a 99.97 % true-positive rate (TPR), whereas the multi-layer perceptron reached only 93.03 %. However, the experiment also revealed that the k-nearest neighbor classifier was the most effective at detecting the latest Android malware, with an 84.57 % true-positive rate, higher than that of the other classifiers.
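The true-positive rate reported above is the fraction of malicious samples that a classifier correctly flags, TPR = TP / (TP + FN). The following is a minimal sketch of that calculation, not the evaluation code used in the paper; the function name `true_positive_rate`, the label encoding (1 = malware, 0 = benign) and the toy labels are assumptions made for illustration only.

```python
from typing import Sequence


def true_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return TPR = TP / (TP + FN), assuming 1 = malware and 0 = benign."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # True positives: malware samples that were predicted as malware.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # False negatives: malware samples that were predicted as benign.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("y_true contains no malware samples")
    return tp / (tp + fn)


# Hypothetical toy labels, not data from either dataset in the paper.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
tpr = true_positive_rate(y_true, y_pred)
print(f"TPR = {tpr:.2%}")  # prints "TPR = 75.00%"
```

Benign samples that are wrongly flagged do not affect the TPR; they count toward the false-positive rate, which has to be reported separately.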
