Scraping and Preprocessing Commercial Auction Data for Fraud Classification

Over the last three decades, the trading of goods and services through online auctions has increased significantly. However, this business has created an attractive environment for malicious moneymakers, who can commit different types of fraud, such as Shill Bidding (SB). SB is predominant across many auctions, but it is difficult to detect because it resembles normal bidding behaviour. The unavailability of SB datasets makes developing SB detection and classification models burdensome. Furthermore, to implement efficient SB detection models, SB data should be produced from actual auctions on commercial sites. In this study, we first scraped a large number of eBay auctions of a popular product. After preprocessing the raw auction data, we built a high-quality SB dataset based on the most reliable SB strategies. Our aim is to share the preprocessed auction dataset as well as the unlabelled SB training dataset, so that researchers can apply various machine learning techniques to authentic auction and fraud data.
