Improving Credit Card Fraud Detection using a Meta-learning Strategy

One of the issues facing credit card fraud detection systems is that a significant percentage of the transactions they label as fraudulent are in fact legitimate. These “false alarms” delay the detection of fraudulent transactions. Eleven months of credit card transaction data from a major Canadian bank were analyzed to determine the savings that can be achieved by better identifying truly fraudulent transactions. The research used a meta-classifier model consisting of three base classifiers, built with the k-nearest neighbour, decision tree, and naïve Bayesian algorithms. The naïve Bayesian algorithm was also used as the meta-level algorithm, combining the base classifier predictions to produce the final classifier. The results show that deploying the meta-classifier in series with the bank’s existing fraud detection algorithm achieved a 24% to 34% performance improvement, resulting in cost savings of $1.8 to $2.6 million per year.
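The meta-level step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each base classifier emits a binary vote (1 = fraud), and the function names, the Laplace smoothing constant `alpha`, and the toy training votes are all hypothetical. The sketch fits a naïve Bayes combiner over the three base-classifier votes and returns a fraud probability for a new vote pattern.

```python
import math

def train_meta_nb(base_votes, labels, alpha=1.0):
    """Fit a naive Bayes combiner over binary base-classifier votes.

    base_votes: list of tuples, one 0/1 vote per base classifier.
    labels:     list of true 0/1 labels (1 = fraud).
    alpha:      Laplace smoothing constant (an assumption, not from the paper).
    """
    n_base = len(base_votes[0])
    class_counts = {0: 0, 1: 0}
    # vote_counts[c][j][v] = number of class-c rows where classifier j voted v
    vote_counts = {c: [[0, 0] for _ in range(n_base)] for c in (0, 1)}
    for votes, y in zip(base_votes, labels):
        class_counts[y] += 1
        for j, v in enumerate(votes):
            vote_counts[y][j][v] += 1
    return class_counts, vote_counts, alpha

def fraud_probability(model, votes):
    """Return P(fraud | votes), treating votes as independent given the class."""
    class_counts, vote_counts, alpha = model
    total = sum(class_counts.values())
    log_post = {}
    for c in (0, 1):
        # Smoothed class prior
        lp = math.log((class_counts[c] + alpha) / (total + 2 * alpha))
        # Smoothed likelihood of each classifier's vote given the class
        for j, v in enumerate(votes):
            lp += math.log((vote_counts[c][j][v] + alpha)
                           / (class_counts[c] + 2 * alpha))
        log_post[c] = lp
    # Normalize in a numerically stable way
    top = max(log_post.values())
    weights = {c: math.exp(lp - top) for c, lp in log_post.items()}
    return weights[1] / (weights[0] + weights[1])

# Hypothetical votes from (k-NN, decision tree, naive Bayes) on flagged alerts
train_votes = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1),
               (0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 0)]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = train_meta_nb(train_votes, train_labels)
p_all_agree_fraud = fraud_probability(model, (1, 1, 1))
p_all_agree_legit = fraud_probability(model, (0, 0, 0))
print(round(p_all_agree_fraud, 3), round(p_all_agree_legit, 3))
```

In a series deployment such as the one described, a combiner of this kind would be applied only to the transactions the existing system has already flagged, so that alerts with a low combined fraud probability could be deprioritized. How the threshold is chosen is not stated in the abstract and is left open here.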
