A customized classification algorithm for credit card fraud detection

Abstract This paper presents Fraud-BNC, a customized Bayesian Network Classifier (BNC) algorithm for a real credit card fraud detection problem. Fraud-BNC was created automatically by a Hyper-Heuristic Evolutionary Algorithm (HHEA), which organizes knowledge about BNC algorithms into a taxonomy and searches for the best combination of their components for a given dataset. Fraud-BNC was generated using a dataset from PagSeguro, the most popular Brazilian online payment service, and tested together with two strategies for cost-sensitive classification. The results were compared with those of seven other algorithms and analyzed from two perspectives: data classification performance and the economic efficiency of the method. Fraud-BNC provided the best trade-off between the two perspectives, improving the company's current economic efficiency by up to 72.64%.
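The abstract does not say which two cost-sensitive strategies were used, so the following is only a generic illustration of the idea, not the paper's method. It is a minimal sketch of a minimum-expected-cost decision rule applied to a fraud probability such as a Bayesian classifier might output. The function names and the cost values are assumptions made for this example.

```python
def expected_cost_decision(p_fraud, cost_fn, cost_fp):
    """Return True (block the transaction) when blocking has the lower expected cost.

    p_fraud: estimated probability that the transaction is fraudulent.
    cost_fn: hypothetical loss when a fraud is approved (false negative).
    cost_fp: hypothetical loss when a legitimate transaction is blocked (false positive).
    """
    # Approving costs cost_fn only if the transaction turns out to be fraud.
    cost_if_approve = p_fraud * cost_fn
    # Blocking costs cost_fp only if the transaction turns out to be legitimate.
    cost_if_block = (1.0 - p_fraud) * cost_fp
    return cost_if_block < cost_if_approve


def fraud_threshold(cost_fn, cost_fp):
    """Probability above which blocking is the cheaper action under the rule above."""
    return cost_fp / (cost_fp + cost_fn)


if __name__ == "__main__":
    # Illustrative costs only: a missed fraud is assumed 20x as costly as a false alarm.
    print(fraud_threshold(cost_fn=100.0, cost_fp=5.0))
    print(expected_cost_decision(0.20, cost_fn=100.0, cost_fp=5.0))
    print(expected_cost_decision(0.02, cost_fn=100.0, cost_fp=5.0))
```

With asymmetric costs like these, the decision threshold falls well below 0.5, which is why a classifier tuned only for accuracy can be economically inefficient on imbalanced fraud data.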
