Detecting Fraud Using Transaction Frequency Data

Despite all attempts to prevent it, fraud continues to be a major threat to industry and government. In this paper, we present a fraud detection method that identifies irregular transaction usage frequencies in an Enterprise Resource Planning (ERP) system. We discuss the design, development and empirical evaluation of outlier detection and distance measuring techniques that detect frequency-based anomalies within an individual user’s profile, relative to other similar users. We propose three automated techniques for detecting transaction frequency anomalies within each transaction profile: a univariate method, called Boxplot, which is based on the sample median, and two multivariate methods that use Euclidean distance. The two multivariate approaches detect potentially fraudulent activities by identifying (1) users for whom the Euclidean distance between their transaction-type sets is above a certain threshold and (2) users/data points that lie far from other users/clusters or belong to a small cluster, using k-means clustering. The proposed methodology allows an auditor to investigate the transaction frequency anomalies and to adjust parameters, such as the outlier threshold and the Euclidean distance threshold, to tune the number of alerts. The novelty of the proposed technique lies in its ability to trigger alerts automatically from transaction profiles, based on transaction usage over a period of time. Experiments were conducted using a real dataset obtained from the production client of a large organization that runs its business on SAP R/3 (presently the predominant ERP system). The results of this empirical research demonstrate the effectiveness of the proposed approach.
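The three techniques named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact rules, so everything below is an assumption made for illustration: the boxplot method is shown with Tukey-style fences (1.5 times the interquartile range), the distance method compares each user with the component-wise median profile of the peer group, and the k-means step uses a naive initialization (the first k profiles) and flags members of small clusters. All function names, parameters and sample figures are hypothetical.

```python
import math
import statistics


def boxplot_outliers(counts, k=1.5):
    """Univariate check: users whose count lies outside Tukey-style fences.

    `counts` maps user -> frequency of ONE transaction type.
    `k` is the (assumed) outlier threshold an auditor could tune.
    """
    q1, _, q3 = statistics.quantiles(list(counts.values()), n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [user for user, c in counts.items() if c < low or c > high]


def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_outliers(profiles, threshold):
    """Multivariate check: users farther than `threshold` from the peer median profile.

    `profiles` maps user -> list of frequencies, one entry per transaction type.
    """
    reference = [statistics.median(col) for col in zip(*profiles.values())]
    return [u for u, p in profiles.items() if euclidean(p, reference) > threshold]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k points (naive)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: euclidean(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def small_cluster_outliers(profiles, k, min_size):
    """Flag users assigned to a cluster with fewer than `min_size` members."""
    users = list(profiles)
    labels, _ = kmeans([profiles[u] for u in users], k)
    sizes = {j: labels.count(j) for j in set(labels)}
    return [u for u, lab in zip(users, labels) if sizes[lab] < min_size]


if __name__ == "__main__":
    # Hypothetical frequencies for three transaction types per user.
    sample = {
        "u1": [10, 2, 0],
        "u2": [12, 3, 1],
        "u3": [11, 2, 0],
        "u4": [9, 1, 1],
        "u5": [95, 40, 30],
    }
    first_type = {u: p[0] for u, p in sample.items()}
    print(boxplot_outliers(first_type))
    print(distance_outliers(sample, threshold=10.0))
    print(small_cluster_outliers(sample, k=2, min_size=2))
```

Raising `k` in `boxplot_outliers`, `threshold` in `distance_outliers`, or lowering `min_size` reduces the number of alerts, which mirrors the tuning role the abstract assigns to the auditor. A production setting would likely replace the naive k-means initialization with a more robust seeding strategy.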
