Comparison and analysis of logistic regression, Naïve Bayes and KNN machine learning algorithms for credit card fraud detection

Financial fraud is a rapidly growing threat with a severe impact on the economy, collaborative institutions, and administration. Advances in internet technology and a growing dependence on the internet have driven a rapid increase in credit card transactions, and with that increase, fraud rates have become a challenge for the economy. As new security features are added to credit card transactions, fraudsters develop new patterns and find new loopholes to exploit, so the behavior of both fraudulent and normal transactions changes constantly. Credit card data is also highly skewed, which leads to inefficient prediction of fraudulent transactions. To achieve better results, imbalanced or skewed data is pre-processed with a re-sampling technique (over-sampling or under-sampling). Three dataset proportions were used in this study, and random under-sampling was applied to the skewed dataset. This work uses three machine learning algorithms: logistic regression, Naïve Bayes, and K-nearest neighbour. The performance of these algorithms is recorded and compared. The work is implemented in Python, and performance is measured by accuracy, sensitivity, specificity, precision, F-measure, and area under the curve. On the basis of these measurements, the logistic regression model was found to predict fraudulent transactions better than the models built with Naïve Bayes and K-nearest neighbour. Results also improve when under-sampling is applied to the data before the prediction model is developed.
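As a rough illustration of two steps named in the abstract, the sketch below shows random under-sampling of the majority class and the computation of accuracy, sensitivity, specificity, precision, and F-measure from predicted labels. It is not the authors' implementation: the function names, the `ratio` parameter, and the toy data are assumptions made for this example. Area under the curve is left out because it needs classifier scores rather than hard labels.

```python
import random


def random_under_sample(rows, labels, ratio=1.0, seed=0):
    """Keep every fraud row (label 1) and a random subset of normal rows (label 0).

    `ratio` is a hypothetical parameter: the number of normal rows kept
    per fraud row. ratio=1.0 yields a 50:50 class balance.
    """
    rng = random.Random(seed)
    fraud_idx = [i for i, y in enumerate(labels) if y == 1]
    normal_idx = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(normal_idx), int(round(ratio * len(fraud_idx))))
    kept = fraud_idx + rng.sample(normal_idx, n_keep)
    rng.shuffle(kept)  # avoid leaving all fraud rows at the front
    return [rows[i] for i in kept], [labels[i] for i in kept]


def classification_metrics(y_true, y_pred):
    """Compute confusion-matrix metrics, treating label 1 (fraud) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)  # recall on the fraud class
    specificity = safe_div(tn, tn + fp)  # recall on the normal class
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": safe_div(2 * precision * sensitivity,
                              precision + sensitivity),
    }


# Toy data (made up for illustration): 1000 transactions, 10 of them fraud.
toy_rows = list(range(1000))
toy_labels = [1 if i < 10 else 0 for i in range(1000)]
balanced_rows, balanced_labels = random_under_sample(toy_rows, toy_labels)
print(len(balanced_labels), sum(balanced_labels))

# Toy predictions (made up) to exercise the metric function.
print(classification_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                             [1, 1, 0, 0, 0, 0, 0, 1]))
```

Under-sampling would normally be applied only to the training split, so that the metrics are measured on data with the original class skew; that choice is an assumption of this sketch and is not stated in the abstract.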
