Finding the needle: A risk-based ranking of product listings at online auction sites for non-delivery fraud prediction

Non-delivery fraud is a recurring problem at online auction sites: fraudulent sellers list nonexistent products, collect the payments, and then disappear, possibly repeating the swindle under another identity. In this work we identified a set of publicly available features related to listings, sellers and product categories, and built a machine learning system for fraud prediction that takes into account both the high class imbalance of real data and the need, for commercial reasons, to control the false positive rate. We tested the proposed system on data collected from a major Brazilian online auction site and obtained good results in identifying fraudsters before they strike, even when no historical information about them was available. We also evaluated the contribution of category-related features to fraud detection. Finally, we compared the learning algorithm used (boosted trees) with other state-of-the-art methods.
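The idea of ranking listings by risk while keeping the false positive rate under control can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `threshold_for_fpr`, the toy scores and the labels are all assumptions made for illustration, and the scores stand in for the output of any classifier (the abstract names boosted trees). The sketch sorts listings by risk score and picks the lowest cutoff whose false positive rate stays within a given budget.

```python
def threshold_for_fpr(scores, labels, max_fpr):
    """Return the lowest score cutoff whose false positive rate is <= max_fpr.

    Listings with score >= cutoff are flagged as risky.
    labels: 1 = fraudulent listing, 0 = legitimate listing.
    Returns float("inf") if even the top-ranked group exceeds the budget.
    """
    negatives = sum(1 for y in labels if y == 0)
    if negatives == 0:
        raise ValueError("need at least one legitimate listing to measure FPR")

    # Rank listings from highest to lowest risk score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    false_pos = 0
    cutoff = float("inf")
    i = 0
    while i < len(ranked):
        # Listings with tied scores must be flagged together.
        j = i
        group_fp = 0
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            group_fp += 1 if ranked[j][1] == 0 else 0
            j += 1
        if (false_pos + group_fp) / negatives > max_fpr:
            break  # flagging this group would exceed the FPR budget
        false_pos += group_fp
        cutoff = ranked[i][0]
        i = j
    return cutoff


# Toy data (made up for illustration, not taken from the paper):
# 2 fraudulent listings among 8, mimicking class imbalance.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 0, 1, 0, 0, 0, 0, 0]

cutoff = threshold_for_fpr(scores, labels, max_fpr=0.2)
flagged = [s >= cutoff for s in scores]
```

With this toy data, a 20% false positive budget allows one legitimate listing to be flagged, so the cutoff falls at the third-ranked listing and both fraudulent listings are caught. Tightening the budget moves the cutoff up the ranking and trades recall for fewer false alarms.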
