Predicting corporate financial distress based on integration of decision tree classification and logistic regression

In recent years, stock and derivative securities markets around the world have evolved continuously and rapidly. As markets develop, the operating status of enterprises is disclosed periodically in financial statements. Unfortunately, if a firm's executives intentionally dress up its financial statements, the possibility of financial distress may go unobserved in both the short and the long run. Many financial crises have occurred in international markets in recent years, such as the Enron, Kmart, Global Crossing, WorldCom, and Lehman Brothers events. How these events affect business worldwide, especially the financial services industry and investors, has become a matter of public concern. To improve the accuracy of the financial distress prediction model, this paper followed the operating rules of the Taiwan Stock Exchange Corporation (TSEC) and collected 100 listed companies as the initial sample. The empirical experiment used a total of 37 ratios, comprising financial and non-financial ratios, and applied principal component analysis (PCA) to extract suitable variables. Decision tree (DT) classification methods (C5.0, CART, and CHAID) and logistic regression (LR) were used to implement the financial distress prediction model. The experiments produced satisfactory results, which support the feasibility and validity of the proposed methods for predicting the financial distress of listed companies. This paper makes four critical contributions: (1) the more PCA was used, the lower the accuracy obtained by the DT classification approach, whereas PCA had no significant impact on the LR approach; (2) the closer the prediction is to the actual occurrence of financial distress, the higher the accuracy of the DT classification approach, with 97.01% of cases classified correctly 2 seasons prior to the occurrence of financial distress; (3) the empirical results show that PCA increases the error of classifying companies that are in a financial crisis as normal companies; and (4) the DT classification approach obtains better prediction accuracy than the LR approach in the short run (less than one year), whereas the LR approach obtains better prediction accuracy in the long run (more than one and a half years). Therefore, this paper proposes that the artificial intelligence (AI) approach could be a more suitable methodology than traditional statistics for predicting the potential financial distress of a company in the short run.
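Contributions (2) and (3) refer to two different quantities: the overall percentage of correct classifications, and the specific error of labelling a distressed company as normal. The following is a minimal Python sketch of how these could be computed from a model's predictions. The function name `distress_metrics`, the 1/0 label encoding, and the sample labels are assumptions made for illustration; they are not taken from the paper or its data.

```python
from typing import Sequence


def distress_metrics(actual: Sequence[int], predicted: Sequence[int]) -> dict:
    """Summarise a distress classifier (assumed labels: 1 = distressed, 0 = normal)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    correct = sum(1 for a, p in pairs if a == p)
    distressed = sum(1 for a, _ in pairs if a == 1)
    normal = len(pairs) - distressed
    # Distressed firms that the model labelled as normal.
    missed = sum(1 for a, p in pairs if a == 1 and p == 0)
    # Normal firms that the model labelled as distressed.
    false_alarms = sum(1 for a, p in pairs if a == 0 and p == 1)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "missed_distress_rate": missed / distressed if distressed else 0.0,
        "false_alarm_rate": false_alarms / normal if normal else 0.0,
    }


# Made-up labels for ten hypothetical firms (not data from the paper).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
print(distress_metrics(actual, predicted))
```

Reporting the missed-distress rate separately from accuracy matters because a model can score well overall while still failing on the distressed firms, which is the costlier mistake for investors.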
