Machine learning aided Android malware classification

The widespread adoption of Android devices, and their ability to access significant private and confidential information, has made them a target for malware developers. Existing Android malware analysis techniques can be broadly categorized into static and dynamic analysis. In this paper, we present two machine learning aided approaches for static analysis of Android malware. The first approach is based on permissions, and the second is based on source code analysis using a bag-of-words representation model. Our permission-based model is computationally inexpensive and is implemented as a feature of the OWASP Seraphimdroid Android app, which is available from the Google Play Store. Our evaluations indicate an F-score of 95.1% for the source code-based classification model and 89% for the permission-based classification model.
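To make the two feature representations and the reported metric concrete, the following is a minimal sketch, not the authors' implementation. The permission vocabulary, the tokenizer, and the function names `permission_vector`, `bag_of_words`, and `f_score` are illustrative assumptions; the abstract does not specify the actual feature set, the source code preprocessing, or the classifiers used.

```python
import re
from collections import Counter
from typing import Iterable, List, Sequence

# Hypothetical permission vocabulary, chosen for illustration only.
PERMISSIONS = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.RECEIVE_BOOT_COMPLETED",
]


def permission_vector(requested: Iterable[str],
                      vocabulary: Sequence[str] = PERMISSIONS) -> List[int]:
    """Encode an app's requested permissions as a binary feature vector."""
    requested_set = set(requested)
    return [1 if perm in requested_set else 0 for perm in vocabulary]


def bag_of_words(source: str) -> Counter:
    """Count identifier-like tokens in source code (assumed tokenizer)."""
    return Counter(re.findall(r"[A-Za-z_]\w*", source))


def f_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-score (harmonic mean of precision and recall); 1 denotes malware."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, either the binary permission vector or the token counts would serve as input to a classifier, and `f_score` would then be computed on held-out predictions. The permission vector is cheap to compute because it only requires reading the app manifest, which is consistent with the abstract's description of the permission-based model as computationally inexpensive.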
