An HMM and structural entropy based detector for Android malware: An empirical study

Smartphones are becoming increasingly popular and, as a consequence, malware writers are increasingly developing new threats and propagating them through official and third-party markets. Beyond the propagation vectors, malware is also rapidly evolving the techniques it uses to infect victims and to hide its malicious nature from antimalware scanners. From SMS Trojans to legitimate applications repackaged with a malicious payload, from AES-encrypted root exploits to the dynamic loading of a payload retrieved from a remote server: malicious code is becoming increasingly hard to detect. In this paper we experimentally evaluate two techniques for detecting Android malware: the first is based on Hidden Markov Models (HMMs), while the second exploits structural entropy. These two techniques have been successfully applied to detect PC viruses in previous works, and only one work in the literature analyzes the application of HMMs to the detection of Android malware. We demonstrate that these methods, which proved effective for PC viruses, are also successful for detecting and classifying mobile malware. Our results are promising: we obtain a precision of 0.96 in discriminating malware applications, and a precision of 0.978 in identifying the malware family.
