Spotting Earnings Manipulation: Using Machine Learning for Financial Fraud Detection

Earnings manipulation and accounting fraud lead to reduced firm valuation in the long run and to public distrust of the company and its management. Yet manipulating accruals to hide liabilities and inflate earnings is a long-standing fraudulent practice among many listed firms. Because auditing is time-consuming and restricted to a sample of entries, fraud is either not detected or detected belatedly. We believe that supervised machine learning models can identify high-risk firms early enough for the regulator to audit them. We also discuss unsupervised anomaly detection. Since manipulators are far fewer than non-manipulators, the biggest challenge in predicting earnings manipulation is class imbalance in the data, which biases the results of conventional statistical models. In this paper, we build ensemble models to detect accrual manipulation, borrowing theory from the seminal work of Beneish. We also showcase a novel simulation-based sampling technique that handles imbalanced datasets efficiently, and we illustrate our results on data from listed Indian firms. We compare existing ensemble models, establishing the superiority of fairly simple boosting models, and comment on the shortfall of the area under the ROC curve as a performance metric for imbalanced datasets. The paper makes two major contributions: (i) a functional contribution, an easily deployable strategy for identifying high-risk companies; and (ii) a methodological contribution, a simulation-based sampling approach that uses the entire dataset in modeling and can be applied to other cases of highly imbalanced data.
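The remark on the area under the ROC curve can be illustrated with a small sketch. The data below are hypothetical and are not taken from the paper: 1,000 firms with a 1% manipulator rate, and a scoring model that ranks 50 non-manipulators above every manipulator. The helper names `roc_auc` and `precision_recall` are likewise illustrative. The aim is only to show how AUC can look strong while most flagged firms are false alarms.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the pairwise Mann-Whitney definition).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision_recall(labels, scores, threshold):
    """Precision and recall when every firm scoring >= threshold is flagged."""
    flagged = [y for y, s in zip(labels, scores) if s >= threshold]
    true_pos = sum(flagged)
    total_pos = sum(labels)
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / total_pos if total_pos else 0.0
    return precision, recall


# Hypothetical, hand-built data (assumption, not from the paper):
# 10 manipulators and 990 non-manipulators.
labels = [1] * 10 + [0] * 990
# All manipulators score 0.90; 50 non-manipulators score higher (0.95);
# the remaining 940 non-manipulators score low (0.10).
scores = [0.90] * 10 + [0.95] * 50 + [0.10] * 940

auc = roc_auc(labels, scores)
precision, recall = precision_recall(labels, scores, threshold=0.90)
print(f"AUC={auc:.3f} precision={precision:.3f} recall={recall:.3f}")
```

In this constructed case the AUC is about 0.95 and recall is 1.0, yet precision is only about 0.17: of the 60 firms flagged for audit, 50 are not manipulators. Because negatives dominate, a small false-positive rate still produces many false alarms, which is one reason precision-oriented metrics are often reported alongside AUC for imbalanced problems.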
