A THREE-STAGE FEATURE SELECTION USING QUADRATIC PROGRAMMING FOR CREDIT SCORING

Many classification techniques have been applied successfully to credit scoring, but applying them without preprocessing can give unsatisfactory results. Credit datasets are typically large and contain redundant and irrelevant features, which can degrade both the classifier and the accuracy of the resulting model. To address this problem, this study examines a range of filter and wrapper feature selection methods for removing irrelevant features. We argue that the two types of method are complementary: filters rank features by data-driven criteria independently of any classifier, whereas wrappers evaluate feature subsets by the performance of the classifier itself. We then propose a fusion strategy that sequentially combines the ranking criteria of multiple filters with a wrapper method. Evaluations on three credit datasets show that the feature subsets selected by the fusion methods perform as well as or better than those selected by the individual methods.
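The sequential fusion described above can be illustrated with a minimal sketch. It is not the authors' method: the two filter criteria, the mean-rank aggregation, the nearest-centroid classifier, and the synthetic data are all assumptions chosen for brevity, and the quadratic-programming step named in the title is not reproduced here. The sketch only shows the general pattern of fusing several filter rankings and then passing the top-ranked candidates to a wrapper.

```python
import random
from statistics import mean

def pearson_abs(xs, ys):
    """Filter 1 (assumed): absolute Pearson correlation with the class label."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / ((vx * vy) ** 0.5) if vx > 0 and vy > 0 else 0.0

def mean_gap(xs, ys):
    """Filter 2 (assumed): gap between class means, scaled by the feature range."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    span = max(xs) - min(xs)
    return abs(mean(pos) - mean(neg)) / span if span > 0 and pos and neg else 0.0

def ranks(scores):
    """Convert scores to ranks; rank 0 is the highest-scoring feature."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    r = [0] * len(scores)
    for position, j in enumerate(order):
        r[j] = position
    return r

def fuse_filters(columns, y, filters):
    """Stage 1: average the per-filter ranks and sort features by the fused rank."""
    all_ranks = [ranks([f(col, y) for col in columns]) for f in filters]
    fused = [mean(r[j] for r in all_ranks) for j in range(len(columns))]
    return sorted(range(len(columns)), key=lambda j: fused[j])

def loo_accuracy(columns, y, subset):
    """Wrapper criterion: leave-one-out accuracy of a nearest-centroid classifier."""
    n, correct = len(y), 0
    for i in range(n):
        dist = {}
        for c in (0, 1):
            idx = [k for k in range(n) if k != i and y[k] == c]
            centroid = [mean(columns[j][k] for k in idx) for j in subset]
            dist[c] = sum((columns[j][i] - m) ** 2 for j, m in zip(subset, centroid))
        correct += min(dist, key=dist.get) == y[i]
    return correct / n

def wrapper_forward(columns, y, candidates):
    """Stage 2: walk the fused ranking, keeping a feature only if accuracy improves."""
    selected, best = [], 0.0
    for j in candidates:
        acc = loo_accuracy(columns, y, selected + [j])
        if acc > best:
            selected, best = selected + [j], acc
    return selected, best

# Synthetic data (assumed): feature 0 is informative, feature 1 is a near copy
# of feature 0 (redundant), and features 2 and 3 are pure noise.
rng = random.Random(0)
y = [i % 2 for i in range(60)]
f0 = [4.0 * label + rng.gauss(0, 1) for label in y]
f1 = [v + rng.gauss(0, 0.1) for v in f0]
f2 = [rng.gauss(0, 1) for _ in y]
f3 = [rng.gauss(0, 1) for _ in y]
columns = [f0, f1, f2, f3]

order = fuse_filters(columns, y, [pearson_abs, mean_gap])
selected, best = wrapper_forward(columns, y, order[:3])
print("fused ranking:", order)
print("wrapper selection:", selected, "LOO accuracy:", round(best, 3))
```

In this sketch the filters cheaply narrow the candidate set, and the more expensive wrapper is run only on the top-ranked features, which is one way the two families of methods can complement each other.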
