Feature selection in corporate credit rating prediction

Credit rating assessment is a complex process in which many parameters describing a company are taken into consideration and a grade is assigned that represents the reliability of a potential client. Such assessment is expensive, because domain experts have to be employed to perform the rating. One way of lowering this cost is to use an automated rating procedure. In this paper, we assess several automatic classification methods for credit rating assessment. The methods follow the well-known paradigm of supervised machine learning: they are first trained on a dataset representing companies with known credibility and then applied to companies with unknown credibility. We employed a feature selection procedure that improved the accuracy of the ratings obtained by classification. In addition, feature selection reduced the number of parameters describing a company that have to be known before the automatic rating can be performed. We compared two families of feature selection methods: filters, which score features independently of the classifier, and wrappers, which evaluate candidate feature subsets using the classifier itself. Wrappers performed better than filters on both the US and the European datasets. However, the better classification performance was achieved at the cost of additional computation time. Our results also suggest that the US rating methodology favors company size and market value ratios, whereas the European methodology relies more on profitability and leverage ratios.
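The difference between filters and wrappers mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic (two informative and three noise features), the classifier is a simple nearest-centroid rule, the filter ranks features by absolute correlation with the label, the wrapper is a greedy forward search, and all function names (`make_data`, `accuracy`, `filter_select`, `wrapper_select`) are hypothetical.

```python
import random


def make_data(n, seed):
    """Synthetic stand-in for company data (assumption): features 0 and 2
    depend on the class label, features 1, 3 and 4 are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([
            rng.gauss(2.0 * label, 1.0),    # informative
            rng.gauss(0.0, 1.0),            # noise
            rng.gauss(-1.5 * label, 1.0),   # informative
            rng.gauss(0.0, 1.0),            # noise
            rng.gauss(0.0, 1.0),            # noise
        ])
        y.append(label)
    return X, y


def accuracy(X_tr, y_tr, X_te, y_te, feats):
    """Train a nearest-centroid classifier on the chosen features and
    return its accuracy on the held-out rows."""
    centroids = {}
    for c in set(y_tr):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        centroids[c] = [sum(r[f] for r in rows) / len(rows) for f in feats]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(feats, centroids[c])),
        )

    return sum(predict(x) == t for x, t in zip(X_te, y_te)) / len(y_te)


def filter_select(X, y, k):
    """Filter: rank each feature by |Pearson correlation| with the label,
    without ever training the classifier."""
    def score(f):
        col = [x[f] for x in X]
        mx, my = sum(col) / len(col), sum(y) / len(y)
        cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
        vx = sum((a - mx) ** 2 for a in col)
        vy = sum((b - my) ** 2 for b in y)
        return abs(cov) / (vx * vy) ** 0.5

    return sorted(range(len(X[0])), key=score, reverse=True)[:k]


def wrapper_select(X_tr, y_tr, X_val, y_val, k):
    """Wrapper: greedy forward search that retrains the classifier for
    every candidate subset and keeps the feature that helps most."""
    chosen = []
    while len(chosen) < k:
        rest = [f for f in range(len(X_tr[0])) if f not in chosen]
        best = max(
            rest,
            key=lambda f: accuracy(X_tr, y_tr, X_val, y_val, chosen + [f]),
        )
        chosen.append(best)
    return chosen


X, y = make_data(400, seed=0)
X_train, y_train = X[:200], y[:200]
X_val, y_val = X[200:], y[200:]

filter_feats = filter_select(X_train, y_train, k=2)
wrapper_feats = wrapper_select(X_train, y_train, X_val, y_val, k=2)

print("filter :", filter_feats, accuracy(X_train, y_train, X_val, y_val, filter_feats))
print("wrapper:", wrapper_feats, accuracy(X_train, y_train, X_val, y_val, wrapper_feats))
```

The filter computes one score per feature, whereas the wrapper retrains the classifier once for every candidate subset it examines, which is where the additional computation time comes from. In this sketch the wrapper is scored on the same validation split it used for selection, so its reported accuracy is optimistic; a real comparison would need a separate test set.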
