Feature Selection via Least Squares Support Feature Machine

In many applications, such as credit risk management, data are represented as high-dimensional feature vectors, which makes feature selection necessary to reduce computational complexity and to improve generalization ability and interpretability. In this paper, we present a novel feature selection method, the "Least Squares Support Feature Machine" (LS-SFM). The proposed method has two advantages compared with the conventional Support Vector Machine (SVM) and LS-SVM. First, a convex combination of basic kernels, each of which uses a single feature, is adopted as the kernel; this transforms the feature selection problem, which cannot be solved directly within the SVM framework, into an ordinary multiple-parameter learning problem. Second, all parameters are learned by a two-stage iterative algorithm, and a 1-norm regularized cost function enforces sparseness of the feature parameters. The features with nonzero feature parameters are the "support features". Experiments on several UCI datasets and a commercial credit card dataset demonstrate the effectiveness and efficiency of the proposed approach, as illustrated by the sketch below.
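
The abstract describes the LS-SFM construction only at a high level, so the following Python sketch is an illustration of the idea rather than the authors' exact algorithm. It assumes per-feature RBF basic kernels, one common LS-SVM linear system for the first stage, and a simple soft-threshold gradient step on the feature weights for the second stage; the function names (per_feature_kernels, lssvm_solve, update_beta, ls_sfm) and all hyperparameter defaults are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the LS-SFM idea; NOT the authors' exact algorithm.
# Assumptions: per-feature RBF basic kernels, a standard LS-SVM linear
# system for stage 1, and a proximal (soft-threshold) gradient step on
# the feature weights for stage 2.
import numpy as np

def per_feature_kernels(X, Z, gamma=1.0):
    """One RBF basic kernel per feature: k_d(x, z) = exp(-gamma * (x_d - z_d)^2)."""
    diffs = X[:, None, :] - Z[None, :, :]            # shape (n, m, D)
    return np.exp(-gamma * diffs ** 2)               # stack of D basic kernels

def composite_kernel(K_stack, beta):
    """Weighted combination of the basic kernels (sum-to-one not enforced here)."""
    return np.tensordot(K_stack, beta, axes=([2], [0]))   # shape (n, m)

def lssvm_solve(K, y, C=1.0):
    """Stage 1: solve one common LS-SVM linear system for (alpha, b) with beta fixed."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = K + np.eye(n) / C
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    rhs = np.append(y.astype(float), 0.0)
    sol = np.linalg.solve(A, rhs)
    return sol[:n], sol[n]                           # alpha, b

def update_beta(K_stack, y, alpha, b, beta, lam=0.1, step=0.01):
    """Stage 2 (assumed form): proximal gradient step on a squared-error loss
    with a 1-norm penalty, keeping beta nonnegative so zeros mark dropped features."""
    D = K_stack.shape[2]
    residual = composite_kernel(K_stack, beta) @ alpha + b - y
    grad = np.array([residual @ (K_stack[:, :, d] @ alpha) for d in range(D)])
    beta = beta - step * grad
    return np.maximum(beta - step * lam, 0.0)        # soft-threshold -> sparsity

def ls_sfm(X, y, iters=20, gamma=1.0, C=1.0, lam=0.1):
    """Alternate the two stages; return model parameters and the support features."""
    n, D = X.shape
    beta = np.full(D, 1.0 / D)                       # start from the uniform combination
    K_stack = per_feature_kernels(X, X, gamma)
    for _ in range(iters):
        K = composite_kernel(K_stack, beta)
        alpha, b = lssvm_solve(K, y, C)              # stage 1: fit LS-SVM with beta fixed
        beta = update_beta(K_stack, y, alpha, b, beta, lam)  # stage 2: reweight features
    support_features = np.flatnonzero(beta)          # features with nonzero weight
    return alpha, b, beta, support_features
```

Under these assumptions, the sparsity pattern of beta after the iterations identifies the "support features": any feature whose weight is driven to zero by the 1-norm penalty is discarded from the model.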
