Towards automated feature engineering for credit card fraud detection using multi-perspective HMMs

Abstract: Machine learning and data mining techniques have been used extensively to detect credit card fraud. However, most studies treat credit card transactions as isolated events rather than as a sequence of transactions. In this framework, we model a sequence of credit card transactions from three different perspectives: (i) the sequence does or does not contain a fraud; (ii) the sequence is obtained by fixing the card-holder or the payment terminal; (iii) it is a sequence of amounts spent or of elapsed times between the current and previous transactions. Combining the three binary perspectives gives eight sets of sequences from the (training) set of transactions. Each of these eight sets of sequences is modelled with a Hidden Markov Model (HMM). Each HMM associates a likelihood with a transaction given its sequence of previous transactions. These likelihoods are used as additional features in a Random Forest classifier for fraud detection. Our multiple-perspective HMM-based approach automates feature engineering for modelling temporal correlations, which improves the effectiveness of the classification task and increases the detection of fraudulent transactions when combined with the state-of-the-art expert-based feature engineering strategy for credit card fraud detection. Extending previous work, we show that this approach goes beyond e-commerce transactions and provides robust feature engineering across different datasets, hyperparameters and classifiers. Moreover, we compare strategies for dealing with structural missing values.
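The feature construction described in the abstract can be illustrated with a minimal sketch. It enumerates the eight perspective combinations and computes the log-likelihood of a discretised transaction sequence under one HMM with the scaled forward algorithm. The perspective labels, the two-state toy parameters (`START`, `TRANS`, `EMIT`) and the low/high amount bucketing are assumptions made for illustration only; they are not the authors' trained models or their exact scoring procedure.

```python
import math
from itertools import product

# The three binary perspectives from the abstract; label strings are illustrative.
PERSPECTIVES = {
    "history": ("genuine", "fraudulent"),
    "entity": ("card_holder", "terminal"),
    "signal": ("amount", "elapsed_time"),
}

def perspective_combinations():
    """Return the 2 x 2 x 2 = 8 combinations, one per HMM."""
    return list(product(*PERSPECTIVES.values()))

def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs) under a discrete HMM (scaled forward algorithm).

    obs is a non-empty list of symbol indices.
    """
    n_states = len(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: propagate through transitions, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[t]]
                for j in range(n_states)
            ]
        # Rescale at every step to avoid underflow on long sequences.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

# Hypothetical toy HMM: 2 hidden states, 2 symbols (0 = low amount, 1 = high amount).
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.9, 0.1], [0.2, 0.8]]

# One candidate feature: the log-likelihood of a card-holder's recent window.
window = [0, 0, 1]
feature = forward_log_likelihood(window, START, TRANS, EMIT)
```

In a full pipeline, one such value per perspective combination (eight in total) would be appended to each transaction's feature vector before training the classifier. Fitting the HMM parameters, for example with Baum-Welch, is outside the scope of this sketch.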
