Performance criteria for plastic card fraud detection tools

In predictive data mining, algorithms are both optimized and compared using a measure of predictive performance. Different measures can yield different results, so it is crucial to match the measure to the true objectives. In this paper, we explore the desirable characteristics of measures for constructing and evaluating tools for mining plastic card data to detect fraud. We define two measures: one based on minimizing the overall cost to the card company, and the other based on minimizing the amount of fraud given the maximum number of investigations the card company can afford to make. We also describe a plot, analogous to the standard ROC curve, for displaying the performance trace of an algorithm as the relative costs of the two kinds of misclassification (classing a fraudulent transaction as legitimate or vice versa) are varied.
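
As a concrete illustration of the two kinds of measure, the sketch below computes an overall-cost criterion and a fraud-caught-within-budget criterion for a scored set of transactions, and traces how the minimum achievable cost changes as the relative misclassification costs vary. The function names, cost values, thresholding scheme, and simulated data are illustrative assumptions only; they indicate the shape of the computation, not the definitions used in the paper.

```python
import numpy as np

def total_cost(y_true, y_score, threshold, cost_missed_fraud, cost_investigation):
    """Overall cost to the card company at a given score threshold.

    Transactions scoring at or above `threshold` are investigated (each at a
    fixed assumed cost); frauds scoring below it go undetected and each incur
    `cost_missed_fraud`. Both unit costs are assumed constant here, whereas in
    practice they would vary by transaction.
    """
    flagged = y_score >= threshold
    n_investigated = flagged.sum()
    n_missed_fraud = np.sum((y_true == 1) & ~flagged)
    return n_investigated * cost_investigation + n_missed_fraud * cost_missed_fraud


def fraud_caught_with_budget(y_true, y_score, fraud_amount, n_budget):
    """Fraud amount caught when only `n_budget` investigations are affordable.

    Ranks transactions by score and sums the fraud amounts among the
    top-ranked `n_budget` cases.
    """
    order = np.argsort(-y_score)           # highest scores first
    top = order[:n_budget]
    return fraud_amount[top][y_true[top] == 1].sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 10_000
    y_true = (rng.random(n) < 0.002).astype(int)            # ~0.2% fraud rate
    y_score = 0.5 * rng.random(n) + 0.5 * y_true            # a crude, imperfect scorer
    fraud_amount = rng.gamma(2.0, 100.0, size=n) * y_true   # loss per fraudulent transaction

    # Measure 1: overall cost at one threshold, under assumed unit costs.
    print(total_cost(y_true, y_score, threshold=0.7,
                     cost_missed_fraud=200.0, cost_investigation=5.0))

    # Measure 2: fraud caught given a fixed investigation budget.
    print(fraud_caught_with_budget(y_true, y_score, fraud_amount, n_budget=100))

    # Trace analogous to the plot described above: as the ratio of the two
    # misclassification costs varies, record the minimum achievable cost
    # over a grid of thresholds.
    thresholds = np.linspace(0.0, 1.0, 101)
    for ratio in (10.0, 40.0, 200.0):
        costs = [total_cost(y_true, y_score, t,
                            cost_missed_fraud=ratio, cost_investigation=1.0)
                 for t in thresholds]
        print(ratio, min(costs))
```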
