
Feature subset selection for logistic regression via mixed integer optimization

This paper concerns a method of selecting a subset of features for a logistic regression model. Information criteria, such as the Akaike information criterion and the Bayesian information criterion, are employed as goodness-of-fit measures. The purpose of our work is to establish a computational framework for selecting a subset of features with an optimality guarantee. To this end, we devise mixed integer optimization formulations for feature subset selection in logistic regression. Specifically, by making a piecewise linear approximation of the logistic loss function, we pose the problem as a mixed integer linear optimization problem, which can be solved with standard mixed integer optimization software. The computational results demonstrate that when there were fewer than 40 candidate features, our method provided a feature subset that was sufficiently close to an optimal one in a reasonable amount of time. Furthermore, even when there were more candidate features, our method often found a better subset of features, in terms of information criteria, than the stepwise methods did.
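To make the idea of a piecewise linear approximation of the logistic loss concrete, here is a minimal Python sketch. It assumes the approximation is the pointwise maximum of tangent lines to f(v) = log(1 + exp(-v)), which is one standard way to linearize a convex function and always underestimates it. The tangent points, the function names, and the parameter count used in the criteria (selected features plus an intercept) are illustrative assumptions, not details taken from the abstract.

```python
import math


def logistic_loss(v):
    """Logistic loss f(v) = log(1 + exp(-v)), evaluated in a numerically stable way."""
    if v >= 0:
        return math.log1p(math.exp(-v))
    return -v + math.log1p(math.exp(v))


def logistic_loss_grad(v):
    """Derivative f'(v) = -1 / (1 + exp(v))."""
    if v >= 0:
        e = math.exp(-v)
        return -e / (1.0 + e)
    return -1.0 / (1.0 + math.exp(v))


def tangent_lines(points):
    """Return (slope, intercept) of the tangent line to f at each given point."""
    lines = []
    for p in points:
        slope = logistic_loss_grad(p)
        lines.append((slope, logistic_loss(p) - slope * p))
    return lines


def pwl_loss(v, lines):
    """Piecewise linear underestimator: the maximum over the tangent lines."""
    return max(a * v + b for a, b in lines)


def aic(total_loss, n_selected):
    """AIC = 2 * (negative log-likelihood) + 2 * (number of parameters).

    Assumes the parameters are the selected features plus one intercept.
    """
    return 2.0 * total_loss + 2.0 * (n_selected + 1)


def bic(total_loss, n_selected, n_samples):
    """BIC = 2 * (negative log-likelihood) + (number of parameters) * log(n)."""
    return 2.0 * total_loss + (n_selected + 1) * math.log(n_samples)


# Illustrative tangent points (an assumption, chosen only for this sketch).
LINES = tangent_lines([-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0])
```

In a mixed integer linear formulation, each tangent line would become one linear constraint of the form t >= a * v + b on an auxiliary variable t that stands in for the loss of a sample. Because the logistic loss is convex, adding more tangent points can only tighten the underestimate.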

Toshiki Sato | Akiko Yoshise | Ryuhei Miyashiro | Yuichi Takano
