Detecting and Understanding Online Advertising Fraud in the Wild

SUMMARY Online advertising is widely used on the web and in mobile applications, and the monetary damage caused by advertising fraud (ad fraud) has become a severe problem. Existing countermeasures are easily evaded because they rely on conspicuous features (e.g., burstiness of ad requests) that attackers can readily change. We propose an ad-fraud-detection method that leverages features robust against attacker evasion. We designed novel features based on statistics observed in an ad network, calculated from a large number of ad requests from legitimate users, such as the popularity of publisher websites and the tendencies of client environments. We assume that attackers cannot know or manipulate these statistics and that features extracted from fraudulent ad requests therefore tend to be outliers. These features are used to construct a machine-learning model for detecting fraudulent ad requests. We evaluated the proposed method using ad-request logs observed within an actual ad network. The results showed that our features improved the recall rate by 10% and yielded about 100,000–160,000 fewer false negatives per day than conventional features based on the burstiness of ad requests. In addition, by evaluating detection performance on a long-term dataset, we confirmed that the proposed method is robust against performance degradation over time. Finally, we applied the proposed method to a large dataset constructed on an ad network and found several characteristics of recent ad fraud in the wild; for example, a large number of fraudulent ad requests are sent from cloud servers.
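The idea of scoring each ad request against statistics of legitimate traffic can be illustrated with a minimal sketch. This is not the authors' implementation: the field names (`publisher`, `user_agent`), the smoothed negative-log-frequency score, and the toy data are all assumptions chosen for illustration. The sketch only shows how population-level statistics could turn into per-request features on which unusual requests stand out as outliers.

```python
import math
from collections import Counter


def build_stats(legit_requests):
    """Count publishers and client environments seen in legitimate traffic."""
    publishers = Counter(r["publisher"] for r in legit_requests)
    clients = Counter(r["user_agent"] for r in legit_requests)
    return publishers, clients, len(legit_requests)


def rarity(count, total, vocab_size):
    """Negative log of a Laplace-smoothed frequency; larger means rarer."""
    return -math.log((count + 1) / (total + vocab_size + 1))


def extract_features(request, stats):
    """Map one ad request to [publisher rarity, client-environment rarity]."""
    publishers, clients, total = stats
    return [
        rarity(publishers[request["publisher"]], total, len(publishers)),
        rarity(clients[request["user_agent"]], total, len(clients)),
    ]


# Hypothetical legitimate traffic: one popular and one less popular publisher.
legit = (
    [{"publisher": "news.example", "user_agent": "BrowserA"}] * 90
    + [{"publisher": "blog.example", "user_agent": "BrowserB"}] * 10
)
stats = build_stats(legit)

common = extract_features(
    {"publisher": "news.example", "user_agent": "BrowserA"}, stats
)
unusual = extract_features(
    {"publisher": "obscure.example", "user_agent": "HeadlessX"}, stats
)
```

In this toy setup, the request from an unseen publisher and client environment receives higher rarity scores on both features than the request matching popular legitimate traffic. In a full pipeline, feature vectors of this kind would be passed, together with labels, to a machine-learning classifier.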
