Re-checking App Behavior against App Description in the Context of Third-party Libraries

Recent research has suggested promising approaches that identify potential malware by checking for inconsistencies between an app's description and its actual behavior. However, state-of-the-art approaches ignore the impact of third-party libraries (TPLs) when detecting outliers, which can greatly affect the detection results in two ways. On one hand, most Android apps do not list the functionality of TPLs in their descriptions, which can cause false positives, since many apps that use TPLs will be identified as outliers. On the other hand, it is important to separate TPLs from custom code when analyzing sensitive behaviors; otherwise, the malicious behaviors of custom code will be obscured by TPLs. In this paper, we revisit the study of checking app behavior against app description in the context of TPLs. Experimental results on more than 400K Android apps suggest that more than 54% of apps are no longer identified as outliers after filtering TPLs, and that we could identify roughly 50% of new outliers. Furthermore, removing the impact of TPLs helps to identify malware and pinpoint the malicious behavior of custom code. Our results shed light on applying TPL analysis to enhance a variety of mobile app analysis tasks.
