Click Stream Data Analysis for Online Fraud Detection in E-Commerce

Web services have become an integral part of everyday life, and so has advertising on web pages. Many e-commerce companies generate advertising revenue by selling clicks, a scheme known as the Pay-Per-Click model. In this model, an e-commerce company is paid each time an advertisement link on its website is clicked, leading the visitor to the sponsoring company’s content. However, some of these companies inflate the number of clicks their sites generate. The generation of such invalid clicks, by humans or by software, with the intention of obtaining money fraudulently is known as click fraud. In this article we show how click fraud can be unmasked using various time features, e.g., the period of the day and the day of the week in which a user (identified by IP address) clicks. We combine several different time features into a timeprint. We use machine learning methods in a number of experiments to understand to what extent timeprints can be used to identify click fraud. The results show that timeprints can indeed be a useful tool for improving the quality of click fraud analysis.
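
As an illustration of the timeprint idea, the following minimal Python sketch aggregates click timestamps per IP address into a vector of period-of-day and day-of-week shares. It is not the authors' implementation: the four six-hour periods, the names `period_of_day` and `build_timeprints`, and the sample clicks are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Assumed period-of-day bins (six hours each); the abstract does not define them.
PERIODS = ("night", "morning", "afternoon", "evening")


def period_of_day(hour):
    """Map an hour (0-23) to one of the four assumed six-hour periods."""
    return PERIODS[hour // 6]


def build_timeprints(clicks):
    """Turn (ip, timestamp) click records into one feature vector per IP.

    Each vector holds the share of the IP's clicks in each period of the
    day (4 values) followed by the share on each weekday (7 values,
    Monday first).
    """
    by_period = defaultdict(Counter)
    by_weekday = defaultdict(Counter)
    for ip, ts in clicks:
        by_period[ip][period_of_day(ts.hour)] += 1
        by_weekday[ip][ts.weekday()] += 1

    prints = {}
    for ip in by_period:
        total = sum(by_period[ip].values())
        prints[ip] = (
            [by_period[ip][p] / total for p in PERIODS]
            + [by_weekday[ip][d] / total for d in range(7)]
        )
    return prints


# Made-up sample data, for illustration only.
clicks = [
    ("10.0.0.1", datetime(2017, 3, 6, 2, 15)),    # Monday, night
    ("10.0.0.1", datetime(2017, 3, 6, 3, 40)),    # Monday, night
    ("10.0.0.1", datetime(2017, 3, 7, 14, 5)),    # Tuesday, afternoon
    ("10.0.0.2", datetime(2017, 3, 11, 20, 30)),  # Saturday, evening
]
timeprints = build_timeprints(clicks)
```

Vectors of this kind could then be passed, together with fraud labels, to a standard classifier; which time features and which learning methods work best is the question the experiments address.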
