Comparison of individual, ensemble and integrated ensemble machine learning methods to predict China’s SME credit risk in supply chain finance

Supply chain finance (SCF) has become increasingly important for small- and medium-sized enterprises (SMEs) owing to the global credit crunch, supply chain financing difficulties, and tightening credit criteria for corporate lending. Predicting SME credit risk is therefore important for keeping SCF operating smoothly. In this paper, we apply six methods to predict SME credit risk in SCF and compare their effectiveness and feasibility: one individual machine learning (IML) method (decision tree); three ensemble machine learning (EML) methods [bagging, boosting, and random subspace (RS)]; and two integrated ensemble machine learning (IEML) methods (RS–boosting and multi-boosting). As empirical samples, we use quarterly financial and non-financial data for 2012–2013 from 48 listed SMEs on the Small and Medium Enterprise Board of the Shenzhen Stock Exchange, six listed core enterprises (CEs) on the Shanghai Stock Exchange, and three listed CEs on the Shenzhen Stock Exchange. Experimental results show that the IEML methods perform better than the IML and EML methods. In particular, RS–boosting is the best of the six methods at predicting SME credit risk.
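To make the idea of an integrated ensemble concrete, the following is a minimal sketch of RS–boosting: several boosted models are each trained on a random subset of the features, and their predictions are combined by majority vote. It is an illustration under stated assumptions, not the implementation used in the paper. The base learner here is a one-level decision stump rather than a full decision tree, the data are synthetic (four made-up numeric features standing in for financial indicators), and all function names and parameter values (`fit_rs_boosting`, `n_members`, `subspace`, `rounds`) are hypothetical.

```python
import math
import random

def fit_stump(X, y, w, features):
    """Weighted decision stump over the given feature indices; labels are +1/-1."""
    best = None  # (weighted error, feature index, threshold, polarity)
    for f in features:
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[f] > thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, row):
    _, f, thr, pol = stump
    return pol if row[f] > thr else -pol

def fit_adaboost(X, y, features, rounds=10):
    """Discrete AdaBoost whose stumps may only split on `features`."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w, features)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Up-weight misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def boost_predict(model, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in model)
    return 1 if score >= 0 else -1

def fit_rs_boosting(X, y, n_members=7, subspace=2, rounds=10, seed=0):
    """Random subspace over boosting: one boosted model per random feature subset."""
    rng = random.Random(seed)
    d = len(X[0])
    return [fit_adaboost(X, y, rng.sample(range(d), subspace), rounds)
            for _ in range(n_members)]

def rs_predict(ensemble, row):
    votes = sum(boost_predict(m, row) for m in ensemble)
    return 1 if votes >= 0 else -1

def accuracy(predict, model, X, y):
    return sum(predict(model, r) == yi for r, yi in zip(X, y)) / len(y)

def make_data(n, seed):
    """Synthetic stand-in data: +1 = low risk, -1 = high risk (labels are made up)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    y = [1 if r[0] + r[1] + 0.5 * r[2] + rng.gauss(0, 0.3) > 0 else -1 for r in X]
    return X, y

X_train, y_train = make_data(120, seed=1)
X_test, y_test = make_data(80, seed=2)

boost = fit_adaboost(X_train, y_train, features=range(4))
rs_boost = fit_rs_boosting(X_train, y_train)

results = {
    "boosting_train": accuracy(boost_predict, boost, X_train, y_train),
    "boosting_test": accuracy(boost_predict, boost, X_test, y_test),
    "rs_boosting_train": accuracy(rs_predict, rs_boost, X_train, y_train),
    "rs_boosting_test": accuracy(rs_predict, rs_boost, X_test, y_test),
}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

On this toy data the two accuracies need not reproduce the ranking reported above; the sketch only shows how the feature-sampling layer wraps the boosting layer. An odd `n_members` is used so that the majority vote cannot tie.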
