Orthogonal support vector machine for credit scoring

Logistic regression is the most commonly used technique for credit scoring, and more recent research has proposed the support vector machine as a more effective method. However, both logistic regression and the support vector machine suffer from the curse of dimensionality. In this paper, we introduce a new way to address this problem, which we call orthogonal dimension reduction. We discuss the properties of this method in detail and test it against two other common statistical approaches, principal component analysis and hybridizing logistic regression, to better model and evaluate the data. In experiments on the German data set, we also observe an interesting phenomenon in the use of the support vector machine, which we define as 'dimensional interference' and discuss in general terms. The cross-validation results show that, when logistic regression is used to filter the dummy variables and orthogonal feature extraction is applied, the support vector machine not only has lower complexity and faster convergence, but also achieves better performance.
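To make the idea of orthogonal feature extraction concrete, the following is a minimal sketch and not the procedure used in this paper, which the abstract does not specify. It swaps in plain modified Gram-Schmidt orthogonalization as a stand-in, and the function name `orthogonalize`, the tolerance `tol`, and the toy data are all assumptions made for illustration. It shows how orthogonalizing dummy-variable columns can expose and drop a redundant dimension.

```python
import math


def orthogonalize(columns, tol=1e-10):
    """Modified Gram-Schmidt (illustrative stand-in, not the paper's method).

    Returns orthonormal vectors spanning the input columns. A column that is
    numerically a linear combination of earlier columns is dropped, so the
    result may contain fewer vectors than the input.
    """
    basis = []
    for col in columns:
        v = list(col)
        # Remove the component of v along each vector already in the basis.
        for q in basis:
            proj = sum(a * b for a, b in zip(v, q))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        # Keep the remainder only if it adds a genuinely new direction.
        if norm > tol:
            basis.append([a / norm for a in v])
    return basis


# Hypothetical toy data: 4 applicants, an intercept column, and three dummy
# variables coding one 3-level categorical attribute. The three dummies sum
# to the intercept, so the four columns are linearly dependent.
intercept = [1.0, 1.0, 1.0, 1.0]
dummy_a = [1.0, 0.0, 0.0, 1.0]
dummy_b = [0.0, 1.0, 0.0, 0.0]
dummy_c = [0.0, 0.0, 1.0, 0.0]

basis = orthogonalize([intercept, dummy_a, dummy_b, dummy_c])
print(len(basis))  # number of orthonormal directions kept
```

In this toy case the last dummy column carries no new direction once the intercept and the other two dummies are accounted for, so it is discarded. A classifier such as a support vector machine would then be trained on the projections of the data onto the retained directions.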
