CATCH: A detecting algorithm for coalition attacks of hit inflation in internet advertising

Abstract As the Internet flourishes, online advertising has become essential to business marketing campaigns. To run a campaign, advertisers provide their advertisements to Internet publishers, who are paid commissions based on the clicks made on the posted advertisements or on the purchases of the advertised products. Since the payment to a publisher is proportional to the number of clicks received on the advertisements it hosts, dishonest publishers are motivated to inflate the number of clicks on the advertisements hosted on their web sites. Because preventing click fraud is critical for online advertising to be reliable, online advertisers make efforts to prevent it effectively. However, the methods used for click fraud are also becoming more complex and sophisticated. In this paper, we study the problem of detecting coalition attacks of click fraud. The coalition attack is one of the latest sophisticated click fraud techniques: by joining a coalition, fraudsters can obtain more gain while lowering the probability of being detected. We introduce new definitions of a coalition and propose a novel algorithm called CATCH to find such coalitions. Extensive experiments with synthetic and real-life data sets confirm that our notion of coalition allows us to detect coalitions much more effectively than that of previous work.
