Mixed integer second-order cone programming formulations for variable selection in linear regression

This study concerns a method of selecting the best subset of explanatory variables in a multiple linear regression model. Goodness-of-fit measures such as adjusted R², AIC, and BIC are generally used to evaluate subset regression models. Variable selection with respect to these measures is usually performed by stepwise regression, but stepwise methods do not always find the best subset of explanatory variables. In this paper, we propose mixed integer second-order cone programming (MISOCP) formulations for selecting the best subset of variables with respect to adjusted R², AIC, and BIC. Computational experiments show that, in terms of these measures, the proposed formulations yield better solutions than those obtained by common stepwise regression methods.
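To make the approach concrete, below is a minimal sketch of the adjusted-R² case in Python with cvxpy; it is an illustration of the general idea, not the paper's exact model, and the big-M bound M, the function name best_subset_adj_r2, and the solver choice are assumptions introduced for this sketch. Since the total sum of squares is fixed, maximizing adjusted R² is equivalent to minimizing RSS / (n − k − 1), and the constraint RSS ≤ t·(n − k − 1) is a rotated second-order cone constraint once binary indicators z_j link each coefficient β_j to its selection.

```python
import cvxpy as cp

def best_subset_adj_r2(X, y, M=100.0):
    """MISOCP sketch: select the subset maximizing adjusted R^2,
    i.e. minimizing RSS / (n - k - 1), via binary indicators z.
    M is an assumed big-M bound on the coefficient magnitudes."""
    n, p = X.shape
    beta = cp.Variable(p)              # regression coefficients
    b0 = cp.Variable()                 # intercept, always included
    z = cp.Variable(p, boolean=True)   # z[j] = 1 iff variable j is selected

    residual = y - X @ beta - b0
    dof = n - 1 - cp.sum(z)            # residual degrees of freedom, n - k - 1

    # quad_over_lin(residual, dof) models RSS / dof; the epigraph
    # RSS <= t * dof is a rotated second-order cone constraint.
    objective = cp.Minimize(cp.quad_over_lin(residual, dof))
    constraints = [
        cp.abs(beta) <= M * z,         # big-M link: beta_j = 0 unless z[j] = 1
        dof >= 1,                      # keep at least one residual dof
    ]
    prob = cp.Problem(objective, constraints)
    # Any MISOCP-capable solver works here (e.g. ECOS_BB, MOSEK, GUROBI).
    prob.solve(solver=cp.ECOS_BB)
    return z.value, beta.value, b0.value
```

The AIC and BIC objectives involve a logarithm of the RSS, so they need a further reformulation before they fit the MISOCP framework; the sketch above covers only the adjusted-R² case.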
