Applying Support Vector Machine, C5.0, and CHAID to the Detection of Financial Statements Frauds

This paper applies support vector machine (SVM), the C5.0 decision tree, and CHAID to build an effective model for detecting financial reporting fraud. The research data, covering 2007-2016, are sourced from the Taiwan Economic Journal (TEJ). The sample consists of 28 companies that committed financial statement fraud and 84 companies that did not (a ratio of 1 to 3), all listed on the Taiwan Stock Exchange or the Taipei Exchange during the research period. Key variables are first selected with SVM and C5.0, and the detection model is then built with CHAID and SVM. Both financial and non-financial variables are used to improve the accuracy of the detection model. The results show that the C5.0-SVM model yields the highest accuracy at 83.15%, followed by the SVM-SVM model (81.91%), the C5.0-CHAID model (80.93%), and the SVM-CHAID model (77.16%).
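To put the reported accuracy rates in context, the short sketch below compares them with a majority-class baseline, that is, a trivial classifier that labels every company as non-fraudulent. This comparison is not part of the paper. It assumes the evaluation data keep the 1-to-3 sample ratio stated above, and it reads each model name as "variable selection method, then classifier". The function name `margin_over_baseline` is purely illustrative.

```python
# Sample sizes as stated in the abstract.
n_fraud = 28
n_non_fraud = 84
total = n_fraud + n_non_fraud

# Accuracy (%) of a trivial classifier that labels every company "non-fraud".
# Assumption: the evaluation data preserve the 1:3 sample ratio.
baseline = 100 * n_non_fraud / total

# Accuracy rates (%) reported in the abstract.
reported = {
    "C5.0-SVM": 83.15,
    "SVM-SVM": 81.91,
    "C5.0-CHAID": 80.93,
    "SVM-CHAID": 77.16,
}


def margin_over_baseline(accuracy, baseline_accuracy):
    """Return the gain in percentage points over the baseline."""
    return round(accuracy - baseline_accuracy, 2)


margins = {
    name: margin_over_baseline(acc, baseline)
    for name, acc in reported.items()
}

print(f"majority-class baseline: {baseline:.2f}%")
for name, margin in margins.items():
    print(f"{name}: +{margin:.2f} points over baseline")
```

Under that assumption the baseline is 75%, so the reported models gain roughly 2 to 8 percentage points over it. Accuracy alone does not show how many of the fraud cases are caught, so measures such as recall on the fraud class would be needed for a fuller picture.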
