Should You Consider Adware as Malware in Your Study?

Empirical validations of research approaches eventually require a curated ground truth. In studies related to Android malware, such a ground truth is built by leveraging Anti-Virus (AV) scanning reports which are often provided free through online services such as VirusTotal. Unfortunately, these reports do not offer precise information for appropriately and uniquely assigning classes to samples in app datasets: AV engines indeed do not have a consensus on specifying information in labels. Furthermore, labels often mix information related to families, types, etc. In particular, the notion of “adware” is currently blurry when it comes to maliciousness. There is thus a need to thoroughly investigate cases where adware samples can actually be associated with malware (e.g., because they are tagged as adware but could be considered as malware as well).In this work, we present a large-scale analytical study of Android adware samples to quantify to what extent “adware should be considered as malware”. Our analysis is based on the Androzoo repository of 5 million apps with associated AV labels and leverages a state-of-the-art label harmonization tool to infer the malicious type of apps before confronting it against the ad families that each adware app is associated with. We found that all adware families include samples that are actually known to implement specific malicious behavior types. Up to 50% of samples in an ad family could be flagged as malicious. Overall the study demonstrates that adware is not necessarily benign.

[1]  Jacques Klein,et al.  AndroZoo: Collecting Millions of Android Apps for the Research Community , 2016, 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR).

[2]  Jacques Klein,et al.  FraudDroid: automated ad fraud detection for Android apps , 2017, ESEC/SIGSOFT FSE.

[3]  Jacques Klein,et al.  AndroZoo++: Collecting Millions of Android Apps and Their Metadata for the Research Community , 2017, ArXiv.

[4]  Jian Liu,et al.  LibD: Scalable and Precise Third-Party Library Detection in Android Markets , 2017, 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE).

[5]  Juan Caballero,et al.  AVclass: A Tool for Massive Malware Labeling , 2016, RAID.

[6]  Jacques Klein,et al.  Static analysis of android apps: A systematic literature review , 2017, Inf. Softw. Technol..

[7]  Annamalai Narayanan,et al.  AdDetect: Automated detection of Android ad libraries using semantic analysis , 2014, 2014 IEEE Ninth International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP).

[8]  Carl A. Gunter,et al.  Free for All! Assessing User Data Exposure to Advertising Libraries on Android , 2016, NDSS.

[9]  Jacques Klein,et al.  Euphony: Harmonious Unification of Cacophonous Anti-Virus Vendor Labels for Android Malware , 2017, 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR).

[10]  Jacques Klein,et al.  An Investigation into the Use of Common Libraries in Android Apps , 2015, 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER).

[11]  Jacques Klein,et al.  Automated Testing of Android Apps: A Systematic Literature Review , 2019, IEEE Transactions on Reliability.

[12]  Thomas M. Chen,et al.  A Review of Significance of Energy-Consumption Anomaly in Malware Detection in Mobile Devices , 2016, Int. J. Cyber Situational Aware..

[13]  Christopher Krügel,et al.  Obfuscation-Resilient Privacy Leak Detection for Mobile Apps Through Differential Analysis , 2017, NDSS.

[14]  Li Li,et al.  How do Mobile Apps Violate the Behavioral Policy of Advertisement Libraries? , 2018, HotMobile '18.

[15]  Saikat Guha,et al.  Privad: Practical Privacy in Online Advertising , 2011, NSDI.

[16]  Xuxian Jiang,et al.  Unsafe exposure analysis of mobile in-app advertisements , 2012, WISEC '12.

[17]  Yajin Zhou,et al.  Dissecting Android Malware: Characterization and Evolution , 2012, 2012 IEEE Symposium on Security and Privacy.

[18]  Helen Nissenbaum,et al.  Adnostic: Privacy Preserving Targeted Advertising , 2010, NDSS.

[19]  Michael Eichberg,et al.  CodeMatch: obfuscation won't conceal your repackaged app , 2017, ESEC/SIGSOFT FSE.

[20]  Haoyu Wang,et al.  LibRadar: Fast and Accurate Detection of Third-Party Libraries in Android Apps , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C).

[21]  Jacques Klein,et al.  IccTA: Detecting Inter-Component Privacy Leaks in Android Apps , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[22]  Mitsuaki Akiyama,et al.  Clone or Relative?: Understanding the Origins of Similar Android Apps , 2016, IWSPA@CODASPY.