Dynamic Android Malware Classification Using Graph-Based Representations

Malware classification for the Android ecosystem can be performed using a range of techniques. One major technique that has been gaining ground recently is dynamic analysis based on system call invocations recorded during the executions of Android applications. Dynamic analysis has traditionally been based on converting system calls into flat feature vectors and feeding the vectors into machine learning algorithms for classification. In this paper, we implement three traditional feature-vector-based representations for Android system calls. For each feature vector representation, we also propose a novel graph-based representation. We then use graph kernels to compute pair-wise similarities and feed these similarity measures into a Support Vector Machine (SVM) for classification. To speed up the graph kernel computation, we compress the graphs using the Compressed Row Storage format, and then we apply OpenMP to parallelize the computation. Experiments show that the graph-based representations are able to improve the classification accuracy over the corresponding feature-vector-based representations from the same input. Finally we show that different representations can be combined together to further improve classification accuracy.

[1]  Radu State,et al.  Malware analysis with graph kernels and support vector machines , 2009, 2009 4th International Conference on Malicious and Unwanted Software (MALWARE).

[2]  Wei Wang,et al.  Parallelization of Shortest Path Graph Kernels on Multi-Core CPUs and GPUs , 2013 .

[3]  Carsten Willems,et al.  Automatic analysis of malware behavior using machine learning , 2011, J. Comput. Secur..

[4]  Sotiris Ioannidis,et al.  Rage against the virtual machine: hindering dynamic analysis of Android malware , 2014, EuroSec '14.

[5]  Christian Platzer,et al.  MARVIN: Efficient and Comprehensive Mobile App Classification through Static and Dynamic Analysis , 2015, 2015 IEEE 39th Annual Computer Software and Applications Conference.

[6]  Gerardo Canfora,et al.  A Classifier of Malicious Android Applications , 2013, 2013 International Conference on Availability, Reliability and Security.

[7]  Franklin Tchakounté,et al.  System Calls Analysis of Malwares on Android , 2013 .

[8]  Xuxian Jiang,et al.  Catch Me If You Can: Evaluating Android Anti-Malware Against Transformation Attacks , 2014, IEEE Transactions on Information Forensics and Security.

[9]  Thomas Schreck,et al.  Mobile-Sandbox: combining static and dynamic analysis with machine-learning techniques , 2015, International Journal of Information Security.

[10]  Z. Rakamaric,et al.  Android Malware Detection Based on System Calls , 2015 .

[11]  S. V. N. Vishwanathan,et al.  SPF-GMKL: generalized multiple kernel learning with a million kernels , 2012, KDD.

[12]  You Joung Ham,et al.  Detection of Malicious Android Mobile Applications Based on Aggregated System Call Events , 2014 .

[13]  Heng Yin,et al.  DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis , 2012, USENIX Security Symposium.

[14]  L. Cavallaro,et al.  A System Call-Centric Analysis and Stimulation Technique to Automatically Reconstruct Android Malware Behaviors , 2013 .

[15]  Yanick Fratantonio,et al.  ANDRUBIS -- 1,000,000 Apps Later: A View on Current Android Malware Behaviors , 2014, 2014 Third International Workshop on Building Analysis Datasets and Gathering Experience Returns for Security (BADGERS).

[16]  Wei Yu,et al.  On behavior-based detection of malware on Android platform , 2013, 2013 IEEE Global Communications Conference (GLOBECOM).

[17]  Curtis B. Storlie,et al.  Graph-based malware detection using dynamic analysis , 2011, Journal in Computer Virology.

[18]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[19]  Mu Zhang,et al.  Semantics-Aware Android Malware Classification Using Weighted Contextual API Dependency Graphs , 2014, CCS.

[20]  Christopher Krügel,et al.  A survey on automated dynamic malware-analysis techniques and tools , 2012, CSUR.

[21]  Yanick Fratantonio,et al.  Andrubis: Android Malware Under the Magnifying Glass , 2014 .

[22]  You Joung Ham,et al.  Android Mobile Application System Call Event Pattern Analysis for Determination of Malicious Attack , 2014 .

[23]  Konrad Rieck,et al.  DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket , 2014, NDSS.

[24]  Christopher Krügel,et al.  A quantitative study of accuracy in system call-based malware detection , 2012, ISSTA 2012.

[25]  Hans-Peter Kriegel,et al.  Shortest-path kernels on graphs , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[26]  Simin Nadjm-Tehrani,et al.  Crowdroid: behavior-based malware detection system for Android , 2011, SPSM '11.

[27]  Moshe Kam,et al.  Run-time classification of malicious processes using system call analysis , 2015, 2015 10th International Conference on Malicious and Unwanted Software (MALWARE).

[28]  Anthony Widjaja,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.

[29]  Arun K. Pujari,et al.  New Malicious Code Detection Using Variable Length n-grams , 2006, ICISS.