Fraud detection in reputation systems in e-markets using logistic regression and stepwise optimization

Reputation is the opinion of the public toward a person, a group of people, or an organization. Reputation systems are particularly important in e-markets, where they help buyers decide whether to purchase a product. Since a higher reputation means more profit, some users try to deceive such systems to increase their reputation. E-markets should protect their reputation systems from attacks in order to maintain a sound environment. This work addresses the task of finding attempts to deceive reputation systems in e-markets. Our goal is to generate a list of users (sellers) ranked by the probability of fraud. First, we describe characteristics of transactions that may indicate evidence of fraud, and we extend them to the sellers involved. We report the results of a simple approach that ranks sellers by counting the fraud characteristics they exhibit. We then incorporate characteristics that cannot be used by the counting approach, and we apply logistic regression to both the original and the improved sets of characteristics. We use real data from a large Brazilian e-market to train and evaluate our methods. The improved set with logistic regression performs better, especially when we apply stepwise optimization. We validate our results with fraud detection specialists from this marketplace. In the end, we increase the number of identified fraudsters against the reputation system by 112%. In terms of ranking, we reach an average precision of 93% after the specialists' review of the list produced with logistic regression and stepwise optimization. We also detect 55% of the fraudsters with a precision of 100%.
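As an illustration of the counting baseline and of the ranking metric mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes each seller is described by binary (0/1) fraud characteristics, and it uses the common definition of average precision as the mean of the precision values measured at each rank where a known fraudster appears. The seller names, the indicator values, and the function names `rank_by_count` and `average_precision` are all hypothetical.

```python
from typing import Dict, List, Sequence, Set


def rank_by_count(flags: Dict[str, Sequence[int]]) -> List[str]:
    """Rank sellers by how many binary fraud characteristics they exhibit."""
    # sorted() is stable, so sellers with equal counts keep their input order.
    return sorted(flags, key=lambda seller: sum(flags[seller]), reverse=True)


def average_precision(ranking: Sequence[str], fraudsters: Set[str]) -> float:
    """Mean of precision@k over every rank k that holds a known fraudster.

    Assumes every known fraudster appears somewhere in the ranking.
    """
    hits = 0
    precisions = []
    for k, seller in enumerate(ranking, start=1):
        if seller in fraudsters:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical toy data: one list of 0/1 indicators per seller.
flags = {
    "seller_a": [1, 1, 1, 0],
    "seller_b": [0, 0, 1, 0],
    "seller_c": [1, 1, 0, 1],
    "seller_d": [0, 0, 0, 0],
}
# Hypothetical labels, standing in for a specialists' review.
known_fraudsters = {"seller_a", "seller_b"}

ranking = rank_by_count(flags)
ap = average_precision(ranking, known_fraudsters)
print(ranking, round(ap, 4))
```

A logistic regression ranking would replace the raw count with a fitted fraud probability per seller, which is what allows non-binary characteristics to contribute; the same `average_precision` function could then be used to compare the two rankings.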
