A Machine Learning Approach to Android Malware Detection

Mobile platforms can now run increasingly complex software and are widely used for sensitive applications such as banking, so malware targeted at mobile devices poses a growing danger. Detecting such malware presents unique challenges, because on-device resources are limited and the user is granted only limited privileges, but it also presents a unique opportunity in the required metadata attached to each application. In this article, we present a machine learning-based system for the detection of malware on Android devices. Our system extracts a number of features and trains a One-Class Support Vector Machine offline (off-device), in order to leverage the higher computing power of a server or cluster of servers.
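As a rough illustration of the offline training step described above, the following minimal sketch fits a One-Class Support Vector Machine with scikit-learn. The abstract does not specify the features or the kernel, so everything concrete here is an assumption: the synthetic binary feature vectors (imagined as one bit per requested permission), the RBF kernel, the value of `nu`, and the helper name `is_suspicious` are all hypothetical.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical stand-in for extracted features: 200 benign applications,
# each described by 20 binary indicators (e.g. one per requested permission).
n_features = 20
benign = (rng.random((200, n_features)) < 0.2).astype(float)

# Offline (server-side) training on a single class of applications.
# kernel, gamma and nu are illustrative choices, not values from the article.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
model.fit(benign)

def is_suspicious(features):
    """Return True when the model labels the feature vector an outlier.

    OneClassSVM.predict returns +1 for inliers and -1 for outliers.
    """
    x = np.asarray(features, dtype=float).reshape(1, -1)
    return model.predict(x)[0] == -1

# Score the training set and one synthetic sample with every indicator set.
train_labels = model.predict(benign)
flag = is_suspicious(np.ones(n_features))
```

Only the fitted model (its support vectors and coefficients) would need to be shipped or queried after training, which is what makes the off-device training split attractive when device resources are limited.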
