A Comprehensive Survey of Data Mining-Based Accounting-Fraud Detection Research

This survey categorizes, compares, and summarizes the data sets, algorithms, and performance measures used in almost all published technical and review articles on automated accounting-fraud detection. Most studies take fraudulent and non-fraudulent companies as their data subjects; the features cover auditor data, corporate governance data, financial statement data, industry data, trading data, and other categories. Earlier research relied mostly on auditor data; later research built models from shared data and public statement data. Corporate governance data have been widely used. Ratio data are generally believed to be more effective than accounting data, and little research has been conducted on time-series data mining. The mining algorithms used in the retrieved literature include statistical tests, regression analysis, neural networks (NN), decision trees, Bayesian networks, and stack variables. Regression analysis is widely used on hidden data. In general, the detection performance and accuracy of NN are superior to those of regression models. The general conclusion is that models achieve a higher detection rate than unaided auditors. There is a need to introduce other algorithms for mining unlabeled data. Because fraud samples are few, some studies drew their conclusions from training samples and may have overestimated the effectiveness of their models.
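The concern about conclusions drawn from training samples can be illustrated with a minimal sketch. It is not taken from any surveyed study: the data are synthetic noise, the 1-nearest-neighbour classifier and the helper names (`make_samples`, `predict_1nn`, `accuracy`) are assumptions chosen only for illustration. Because the features carry no information about the label, any accuracy gap between the training set and a held-out set reflects memorization rather than detection ability.

```python
import random


def make_samples(n_total, n_fraud, n_features, rng):
    """Build (features, label) pairs; the features are pure noise (hypothetical data)."""
    labels = [1] * n_fraud + [0] * (n_total - n_fraud)
    rng.shuffle(labels)
    return [([rng.random() for _ in range(n_features)], y) for y in labels]


def predict_1nn(train, x):
    """Return the label of the training sample closest to x (squared Euclidean distance)."""
    nearest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    return nearest[1]


def accuracy(train, samples):
    """Fraction of samples whose 1-NN prediction matches the true label."""
    hits = sum(predict_1nn(train, x) == y for x, y in samples)
    return hits / len(samples)


rng = random.Random(0)
# Small, imbalanced sets: 10 "fraud" firms out of 100 in each (assumed proportions).
train = make_samples(100, 10, 5, rng)
holdout = make_samples(100, 10, 5, rng)

train_accuracy = accuracy(train, train)      # each point is its own nearest neighbour
holdout_accuracy = accuracy(train, holdout)  # evaluated on firms the model never saw

print(f"accuracy on training samples: {train_accuracy:.2f}")
print(f"accuracy on held-out samples: {holdout_accuracy:.2f}")
```

Evaluating on the training samples reports a perfect score even though nothing can be learned from noise; the held-out figure is the more honest estimate, which is why small fraud samples call for hold-out or cross-validated evaluation.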
