A Novel Ensemble Learning-Based Approach for Click Fraud Detection in Mobile Advertising

By diverting funds away from legitimate partners (a.k.a. publishers), click fraud drains advertising budgets and can seriously harm the viability of the internet advertising market. Fraud detection algorithms that identify fraudulent behavior from user click patterns are therefore extremely valuable. Using the BuzzCity dataset, we propose a novel approach to click fraud detection based on a set of new features derived from existing attributes. The proposed model is evaluated in terms of precision, recall, and the area under the ROC curve. A final ensemble model based on six different learning algorithms proved stable with respect to all three performance indicators. Our final model shows improved results on the training, validation, and test datasets, thus demonstrating that it generalizes to different datasets.
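As a rough illustration of the evaluation described above, the following sketch combines per-publisher fraud scores from several models and computes precision, recall, and the area under the ROC curve. The abstract does not name the six learning algorithms or state how their outputs are combined, so the plain score averaging, the 0.5 decision threshold, the three hypothetical score lists, and all function names here are assumptions made for illustration only, not the authors' method.

```python
from typing import List, Sequence, Tuple


def average_scores(score_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average the fraud scores of several models, publisher by publisher.

    Assumption: simple averaging stands in for the ensemble's combination rule.
    """
    n_models = len(score_lists)
    return [sum(column) / n_models for column in zip(*score_lists)]


def precision_recall(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Precision and recall for the fraud class (label 1) at a given threshold."""
    tp = fp = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen fraudulent publisher
    receives a higher score than a randomly chosen legitimate one
    (ties count as one half).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = fraudulent publisher, 0 = legitimate publisher.
labels = [1, 0, 0, 1, 0, 1]
model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]
model_b = [0.8, 0.1, 0.6, 0.3, 0.2, 0.9]
model_c = [0.7, 0.3, 0.8, 0.2, 0.3, 0.7]

ensemble = average_scores([model_a, model_b, model_c])
precision, recall = precision_recall(labels, ensemble)
auc = roc_auc(labels, ensemble)
print(f"precision={precision:.3f} recall={recall:.3f} auc={auc:.3f}")
```

Reporting all three indicators together matters on skewed data such as fraud labels: precision and recall depend on the chosen threshold, while the area under the ROC curve summarizes ranking quality across all thresholds.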
