URefFlow: A Unified Android Malware Detection Model Based on Reflective Calls

In Android malware detection, sensitive data-flows describe an application's behavior more accurately than conventional features such as signatures and permissions. Static taint analysis is widely used to identify sensitive data-flows because of its high code coverage and low false negative rate. However, existing static taint analysis tools cannot effectively analyze applications that use the Android reflection mechanism. Reflection breaks the application's control-flows and data-flows: when a call graph is constructed, a reflective call site points to the system's reflection-handling method rather than to the method the application actually invokes. This significantly degrades the accuracy with which the application's behavior is represented. To address this issue, this paper proposes URefFlow, a unified Android malware detection model based on reflective calls. URefFlow makes reflective calls explicit by combining the parameters of each reflective call into a standard function call and replacing the reflective call statement with that non-reflective call statement. After extracting the complete sensitive data-flows, including those that pass through reflective calls, we analyze the characteristics of these data-flows to determine whether the application is malicious. Evaluation on thousands of applications shows that URefFlow achieves a detection accuracy of 95.6% with a false positive rate of 0.8%. In addition, the proposed approach complements existing static taint analysis techniques.
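The replacement of a reflective call by an explicit one can be illustrated with a minimal Java sketch. This is not URefFlow's implementation. The class `ReflectiveCallDemo`, the nested `DeviceInfo` class, and its `getDeviceId` method are hypothetical stand-ins for an application and a sensitive API. The sketch only shows the two forms of the same call: the reflective form, where a call graph sees an edge to `Method.invoke`, and the explicit form that a rewriting step would produce once the class and method names are resolved.

```java
import java.lang.reflect.Method;

public class ReflectiveCallDemo {

    // Hypothetical stand-in for a sensitive API (not a real Android class).
    public static class DeviceInfo {
        public String getDeviceId() {
            return "device-123";
        }
    }

    // Reflective form: the call site targets Method.invoke, so a call graph
    // built without reflection support has no edge to getDeviceId.
    public static String viaReflection() {
        try {
            // In a real app the class name is usually a string constant;
            // getName() is used here only to keep the sketch self-contained.
            String className = DeviceInfo.class.getName();
            String methodName = "getDeviceId";

            Class<?> cls = Class.forName(className);
            Object target = cls.getDeclaredConstructor().newInstance();
            Method m = cls.getMethod(methodName);
            return (String) m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    // Explicit form: the resolved class and method names are combined into
    // a standard call, so the edge to getDeviceId is visible statically.
    public static String viaDirectCall() {
        DeviceInfo target = new DeviceInfo();
        return target.getDeviceId();
    }

    public static void main(String[] args) {
        // Both forms return the same value; only the call-graph edge differs.
        System.out.println(viaReflection().equals(viaDirectCall()));
    }
}
```

Both methods compute the same result, so substituting the explicit form preserves behavior while exposing the data-flow from the sensitive method to a static taint analysis.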
