Stable Feature Selection with Support Vector Machines

The support vector machine (SVM) is a popular classification method, well known for finding the maximum-margin hyperplane. Combining SVM with an \(l_{1}\)-norm penalty enables it to perform feature selection and margin maximization simultaneously within a single framework. However, the \(l_{1}\)-norm SVM selects features unstably in the presence of correlated features. We propose a new method that increases the stability of the \(l_{1}\)-norm SVM by encouraging similarity between feature weights according to feature correlations, which are captured via a feature covariance matrix. The proposed method can capture both positive and negative correlations between features. We formulate the model as a convex optimization problem and propose a solution based on alternating minimization. Using both synthetic and real-world datasets, we show that our model achieves better stability and classification accuracy than several state-of-the-art regularized classification methods.
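To make the notion of selection stability concrete, the following is a minimal sketch of one common way to quantify it: the average pairwise Jaccard similarity between the feature subsets chosen on different resamples of the training data. The abstract does not state which stability measure is used, so the choice of Jaccard similarity, the function names `jaccard` and `selection_stability`, and the example feature sets are all illustrative assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of selected feature indices."""
    a, b = set(a), set(b)
    if not a and not b:
        # Two empty selections are treated as identical (an assumption of this sketch).
        return 1.0
    return len(a & b) / len(a | b)


def selection_stability(selected_sets):
    """Mean pairwise Jaccard similarity over all pairs of selection runs.

    Returns 1.0 when every run selects the same features; lower values
    indicate that the selected subset changes from run to run.
    """
    pairs = list(combinations(selected_sets, 2))
    if not pairs:
        raise ValueError("need at least two selected feature sets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical selections from three resamples. Features 0 and 1 stand in for a
# correlated pair: a sparse selector may pick either one, so the subset varies.
unstable_runs = [{0, 3}, {1, 3}, {0, 3}]

# Hypothetical selections where the correlated pair is always kept together.
stable_runs = [{0, 1, 3}, {0, 1, 3}, {0, 1, 3}]

print(selection_stability(unstable_runs))
print(selection_stability(stable_runs))
```

In this toy setting the second list scores higher because the correlated features enter or leave the model together, which is the behaviour that encouraging similar weights for correlated features is meant to promote.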
