Andro_MD: Android Malware Detection based on Convolutional Neural Networks

Android OS maintains its dominance in the smart terminal market, which brings growing threats from malicious applications (apps). Research on Android malware detection has attracted attention from both academia and industry. How to improve detection performance, which classifiers to select, and which features to employ are all critical open issues. Convolutional Neural Networks (CNNs) are a typical deep learning technique that has achieved strong performance in image and speech recognition. In this work, we present Andro_MD, an Android malware detection framework that trains on and classifies samples with a deep learning technique. The framework comprises dataset construction and feature preprocessing, training and classification by CNN, and experimental evaluation. First, an Android app dataset is constructed from 21,000 samples collected from third-party markets, with 34,570 features in 7 categories. Second, we employ sequential and parallel models to train on the extracted features and classify malicious apps. Finally, extensive experiments demonstrate the effectiveness and feasibility of the proposed method. Comparisons with similar work and traditional classifiers show that Andro_MD performs better on malware detection, reaching an accuracy of 99.25% with a false positive rate (FPR) of 0.53%. Among the feature categories, "request permissions" and "used permissions" perform better while using limited dimensions.
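The two reported metrics, accuracy and FPR, can be illustrated with a minimal sketch. It uses the standard confusion-matrix definitions, which the abstract does not state explicitly; the function name `accuracy_and_fpr` and the label convention (1 = malware, 0 = benign) are assumptions made for illustration, not part of Andro_MD.

```python
def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate) for binary labels.

    Assumed convention (not from the paper): 1 = malware, 0 = benign.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # malware correctly flagged
        elif truth == 0 and pred == 1:
            fp += 1  # benign app wrongly flagged as malware
        elif truth == 0 and pred == 0:
            tn += 1  # benign app correctly passed
        else:
            fn += 1  # malware missed

    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # FPR is the share of benign apps that were flagged as malware.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Toy labels for illustration only; these are not data from the paper.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
preds = [1, 1, 0, 0, 0, 0, 0, 1]
acc, fpr = accuracy_and_fpr(truth, preds)
print(f"accuracy={acc:.2%}, FPR={fpr:.2%}")
```

Under these definitions, a low FPR means few benign apps are misreported as malicious, which is why it is usually quoted alongside accuracy for malware detectors.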
