Robust Android Malware Detection Based on Attributed Heterogeneous Graph Embedding

While machine learning is widely used in Android malware detection, machine-learning-based detectors have been shown to be vulnerable to adversarial attacks. Existing defense methods improve robustness at the cost of reduced accuracy. In this paper, we propose a Heterogeneous Graph Embedding Malware Detection method, called HGEMD, which can improve both accuracy and robustness by exploiting relations between apps. Specifically, we first extract API calls from each individual app as its attributes, and extract auxiliary information (i.e., permissions and third-party libraries) from a large corpus of apps to construct relations between them. Then, we build an Attributed Heterogeneous Graph (AHG) to model attributes and relations simultaneously. Furthermore, we adopt a graph convolutional network and an attention mechanism to fuse this heterogeneous information. Experimental results on a large-scale dataset collected from Google Play demonstrate that the proposed method outperforms state-of-the-art methods in terms of both accuracy and robustness.
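The pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation. The app names, API names, permissions, and libraries are hypothetical toy data, the aggregation step is an unweighted neighbor mean standing in for a graph convolution layer, and the attention scores are fixed numbers rather than learned parameters.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus: API calls are node attributes; permissions and
# third-party libraries are auxiliary information used to relate apps.
apps = {
    "app_a": {"apis": {"sendTextMessage", "getDeviceId"},
              "permissions": {"SEND_SMS"}, "libraries": {"libads"}},
    "app_b": {"apis": {"sendTextMessage"},
              "permissions": {"SEND_SMS", "INTERNET"}, "libraries": {"libpay"}},
    "app_c": {"apis": {"openConnection"},
              "permissions": {"INTERNET"}, "libraries": {"libpay"}},
}

def attribute_vectors(apps):
    """Encode each app's API calls as a binary vector over a sorted vocabulary."""
    vocab = sorted(set().union(*(info["apis"] for info in apps.values())))
    return {name: [1.0 if api in info["apis"] else 0.0 for api in vocab]
            for name, info in apps.items()}

def relation_neighbors(apps, key):
    """Link two apps when they share at least one value under `key`.

    Every app is also its own neighbor (self-loop)."""
    index = defaultdict(set)
    for name, info in apps.items():
        for value in info[key]:
            index[value].add(name)
    neighbors = {name: {name} for name in apps}
    for group in index.values():
        for name in group:
            neighbors[name] |= group
    return neighbors

def aggregate(features, neighbors):
    """Average neighbor attributes: a simplified, weight-free convolution step."""
    out = {}
    for name, group in neighbors.items():
        dim = len(features[name])
        out[name] = [sum(features[n][i] for n in group) / len(group)
                     for i in range(dim)]
    return out

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(per_relation, scores):
    """Combine per-relation embeddings with attention weights softmax(scores)."""
    weights = softmax(scores)
    fused = {}
    for name in per_relation[0]:
        dim = len(per_relation[0][name])
        fused[name] = [sum(w * emb[name][i] for w, emb in zip(weights, per_relation))
                       for i in range(dim)]
    return fused

features = attribute_vectors(apps)
perm_nbrs = relation_neighbors(apps, "permissions")
lib_nbrs = relation_neighbors(apps, "libraries")
per_relation = [aggregate(features, perm_nbrs), aggregate(features, lib_nbrs)]
# Assumed fixed attention scores; a real model would learn them.
fused = fuse(per_relation, scores=[0.0, 0.0])
```

The intuition this sketch tries to convey is that each fused embedding mixes an app's own API attributes with those of related apps, so an adversary who perturbs a single app's features changes only part of its representation. A trained model would replace the neighbor mean with learned graph convolution weights and feed the fused embeddings to a classifier.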
