Modeling Insurance Fraud Detection Using Ensemble Combining Classification

This paper continues a previous paper in which the imbalanced-dataset problem was addressed with a proposed partitioning-undersampling technique, and Insurance Fraud Detection (IFD) models were then designed using three base classifiers: Decision Tree, Support Vector Machine, and Artificial Neural Network. Here we propose insurance fraud detection models that apply ensemble combining classifiers to those previously designed base-classifier IFD models. Ten-fold cross-validation is used for testing throughout. The originality of the work lies in using several ensemble combining classifiers and comparing them to choose the best model. Results on a publicly available automobile insurance fraud detection dataset show that the Decision Tree based IFD model (DTIFD) performs slightly better than all the other proposed models: the ensemble combining classifiers produced IFD models with high recall, but DTIFD remained the best. The proposed models were also applied to other imbalanced datasets and compared; the empirical results indicate that the proposed models gave better results.
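As a rough illustration of combining base-classifier outputs, the sketch below applies simple majority voting to the predictions of three classifiers and computes recall on the fraud class. The abstract does not state which combining rules were used, so majority voting is an assumption here. The function names (`majority_vote`, `recall`) and all labels and predictions are made up for illustration and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` is a list of equal-length label lists, one per classifier.
    With an odd number of binary classifiers there are no ties.
    """
    combined = []
    for votes in zip(*predictions):
        # Pick the label that most classifiers agree on for this sample.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive (fraud) samples that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels (1 = fraud, 0 = legitimate) and base-classifier outputs.
y_true = [1, 0, 1, 1, 0, 0]
dt_pred = [1, 0, 1, 0, 0, 0]   # stand-in for a decision tree model
svm_pred = [1, 0, 0, 1, 0, 1]  # stand-in for a support vector machine model
ann_pred = [0, 0, 1, 1, 1, 0]  # stand-in for a neural network model

ensemble_pred = majority_vote([dt_pred, svm_pred, ann_pred])
print("ensemble predictions:", ensemble_pred)
print("ensemble recall:", recall(y_true, ensemble_pred))
print("decision tree recall:", recall(y_true, dt_pred))
```

In this toy data the vote corrects each base classifier's individual mistakes, but that is a property of the made-up predictions, not evidence about the paper's results. In a ten-fold cross-validation setting, the same comparison would be repeated on each held-out fold and the recall values averaged.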
