SECUR-AMA: Active Malware Analysis Based on Monte Carlo Tree Search for Android Systems

Abstract We propose SECUR-AMA, an Active Malware Analysis (AMA) framework for Android. (AMA) is a technique that aims at acquiring knowledge about target applications by executing actions on the system that trigger responses from the targets. The main strength of this approach is the capability of extracting behaviors that would otherwise remain invisible. A key difference from other analysis techniques is that the triggering actions are not selected randomly or sequentially, but following strategies that aim at maximizing the information acquired about the behavior of the target application. Specifically, we design SECUR-AMA as a framework implementing a stochastic game between two agents: an analyzer and a target application. The strategy of the analyzer consists in a reinforcement learning algorithm based on Monte Carlo Tree Search (MCTS) to efficiently search the state and action spaces taking into account previous interactions in order to obtain more information on the target. The target model instead is created online while playing the game, using the information acquired so far by the analyzer and using it to guide the remainder of the analysis in an iterative process. We conduct an extensive evaluation of SECUR-AMA analyzing about 1200 real Android malware divided into 24 families (classes) from a publicly available dataset, and we compare our approach with multiple state-of-the-art techniques of different types, including passive and active approaches. Results show that SECUR-AMA creates more informative models that allow to reach better classification results for most of the malware families in our dataset.

[1]  Vinod Yegneswaran,et al.  Eureka: A Framework for Enabling Static Malware Analysis , 2008, ESORICS.

[2]  Daphne Koller,et al.  Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..

[3]  Alessandro Farinelli,et al.  A Monte Carlo Tree Search approach to Active Malware Analysis , 2017, IJCAI.

[4]  Debin Gao,et al.  Active malware analysis using stochastic games , 2012, AAMAS.

[5]  Vrizlynn L. L. Thing,et al.  A scalable and extensible framework for android malware detection and family attribution , 2019, Comput. Secur..

[6]  Mu Zhang,et al.  Semantics-Aware Android Malware Classification Using Weighted Contextual API Dependency Graphs , 2014, CCS.

[7]  Andrew Walenstein,et al.  The Software Similarity Problem in Malware Analysis , 2006, Duplication, Redundancy, and Similarity in Software.

[8]  Konrad Rieck,et al.  Structural detection of android malware using embedded call graphs , 2013, AISec.

[9]  Juan E. Tapiador,et al.  Dendroid: A text mining approach to analyzing and classifying code structures in Android malware families , 2014, Expert Syst. Appl..

[10]  Mauro Conti,et al.  Detecting Targeted Smartphone Malware with Behavior-Triggering Stochastic Models , 2014, ESORICS.

[11]  Yang Liu,et al.  Semantic modelling of Android malware for effective malware comprehension, detection, and classification , 2016, ISSTA.

[12]  Csaba Szepesvári,et al.  Bandit Based Monte-Carlo Planning , 2006, ECML.

[13]  David Camacho,et al.  Android malware detection through hybrid features fusion and ensemble classifiers: The AndroPyTool framework and the OmniDroid dataset , 2019, Inf. Fusion.

[14]  Chao Yang,et al.  DroidMiner: Automated Mining and Characterization of Fine-grained Malicious Behaviors in Android Applications , 2014, ESORICS.

[15]  Carsten Willems,et al.  Automatic analysis of malware behavior using machine learning , 2011, J. Comput. Secur..

[16]  Simon M. Lucas,et al.  A Survey of Monte Carlo Tree Search Methods , 2012, IEEE Transactions on Computational Intelligence and AI in Games.

[17]  Christopher Krügel,et al.  Exploring Multiple Execution Paths for Malware Analysis , 2007, 2007 IEEE Symposium on Security and Privacy (SP '07).

[18]  Jinbo Bi,et al.  Active learning via transductive experimental design , 2006, ICML.

[19]  David Camacho,et al.  CANDYMAN: Classifying Android malware families by modelling dynamic traces with Markov chains , 2018, Eng. Appl. Artif. Intell..

[20]  Roberto Giacobazzi,et al.  Active Android malware analysis: an approach based on stochastic games , 2016, SSPREW '16.

[21]  Sankardas Roy,et al.  Deep Ground Truth Analysis of Current Android Malware , 2017, DIMVA.

[22]  Gianluca Stringhini,et al.  MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models (Extended Version) , 2016, NDSS 2017.

[23]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[24]  Vijay Laxmi,et al.  SWORD: Semantic aWare andrOid malwaRe Detector , 2018, J. Inf. Secur. Appl..

[25]  Arun Lakhotia,et al.  Fast location of similar code fragments using semantic 'juice' , 2013, PPREW '13.

[26]  Rishabh K. Iyer,et al.  Submodularity in Data Subset Selection and Active Learning , 2015, ICML.

[27]  Juan E. Tapiador,et al.  Picking on the family: Disrupting android malware triage by forcing misclassification , 2018, Expert Syst. Appl..

[28]  José Alberto Hernández,et al.  Machine-Learning based analysis and classification of Android malware signatures , 2019, Future Gener. Comput. Syst..

[29]  Dongho Won,et al.  Malware Variant Detection and Classification Using Control Flow Graph , 2011, ICHIT.

[30]  Lior Rokach,et al.  Novel active learning methods for enhanced PC malware detection in windows OS , 2014, Expert Syst. Appl..

[31]  Dale Schuurmans,et al.  Discriminative Batch Mode Active Learning , 2007, NIPS.

[32]  Heng Yin,et al.  DroidAPIMiner: Mining API-Level Features for Robust Malware Detection in Android , 2013, SecureComm.