Resource-Bounded Fraud Detection

This paper describes an approach to fraud detection for applications in which detection is followed by human analysis of the flagged cases. This setup is common in fraud detection (e.g., credit card misuse, telecom fraud). In real-world applications, this human inspection is usually constrained by limited resources. In this context, standard fraud detection methods that simply tag each case as a possible fraud or not are of limited use when the number of tagged cases exceeds the available resources. A more useful approach is to produce a ranking of the suspected cases, so that the available inspection resources are spent on the highest-ranked cases first. In this paper we propose a method that produces such a ranking. The method is based on the output of standard agglomerative hierarchical clustering algorithms, so it incurs no significant additional computational cost. Our comparisons with a state-of-the-art method provide convincing evidence of the competitiveness of our proposal.
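The abstract does not spell out how the ranking is derived from the clustering output, so the following is only a minimal sketch under a stated assumption: a case that stays a singleton until a late, distant merge is treated as more suspicious. The function names `outlier_ranking` and `top_k_for_inspection`, the single-linkage criterion, and the merge-height score are all illustrative choices, not the scoring rule proposed in the paper.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def outlier_ranking(points):
    """Return case indices ordered from most to least suspicious.

    Assumed score (hypothetical): the distance at which a case, still a
    singleton, is first merged into another cluster by a naive
    single-linkage agglomerative clustering.
    """
    clusters = {i: [i] for i in range(len(points))}
    first_merge_height = [0.0] * len(points)
    while len(clusters) > 1:
        # Find the closest pair of clusters (single linkage).
        best = None
        for a, b in combinations(clusters, 2):
            d = min(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # A singleton taking part in this merge records the merge height.
        for c in (a, b):
            if len(clusters[c]) == 1:
                first_merge_height[clusters[c][0]] = d
        clusters[a] = clusters[a] + clusters.pop(b)
    return sorted(range(len(points)),
                  key=lambda i: first_merge_height[i], reverse=True)


def top_k_for_inspection(points, budget):
    """Select only as many cases as the inspection budget allows."""
    return outlier_ranking(points)[:budget]


# Toy data: four nearby cases and one isolated case (index 4).
cases = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(top_k_for_inspection(cases, budget=1))  # → [4]
```

The point of the sketch is the resource bound: instead of a yes/no tag per case, the analyst receives an ordered list and inspects only the first `budget` entries. The naive clustering loop here is for readability on small inputs; any standard agglomerative implementation would supply the same merge information.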