Supervised Homogeneity Fusion: A Combinatorial Approach

Fusing regression coefficients into homogeneous groups can reveal coefficients that share a common value within each group. Such groupwise homogeneity reduces the intrinsic dimension of the parameter space and yields sharper statistical accuracy. We propose and investigate a new combinatorial grouping approach, called L0-Fusion, that is amenable to mixed-integer optimization (MIO). On the statistical side, we identify a fundamental quantity called grouping sensitivity that underpins the difficulty of recovering the true groups. We show that L0-Fusion achieves grouping consistency under the weakest possible requirement on the grouping sensitivity: if this requirement is violated, then the minimax risk of group misspecification fails to converge to zero. Moreover, we show that in the high-dimensional regime, L0-Fusion can be coupled with a sure-screening set of features without any essential loss of statistical efficiency, while substantially reducing the computational cost. On the algorithmic side, we provide an MIO formulation for L0-Fusion along with a warm-start strategy. Simulations and real data analysis demonstrate that L0-Fusion outperforms its competitors in terms of grouping accuracy.
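To make the estimator concrete, one natural reading of the L0-Fusion criterion described above is least squares constrained to have at most K distinct coefficient values (a minimal sketch consistent with the abstract; the paper's exact formulation may differ):

\min_{\beta \in \mathbb{R}^p} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2
\quad \text{subject to} \quad
\bigl| \{ \beta_1, \ldots, \beta_p \} \bigr| \le K,

where |{β_1, …, β_p}| counts the number of distinct values among the coefficients and K is the prescribed number of groups. The distinct-value constraint is combinatorial, but it can be encoded for an off-the-shelf MIO solver with binary assignment variables and big-M linking constraints. The Python sketch below illustrates one such encoding; the function name l0_fusion_mio, the bound M, and the choice of the Gurobi solver are illustrative assumptions, not the paper's implementation.

import numpy as np
import gurobipy as gp
from gurobipy import GRB

def l0_fusion_mio(X, y, K, M=10.0):
    """Illustrative L0-Fusion MIO: least squares with at most K distinct
    coefficient values. z[j, k] = 1 assigns coefficient j to group k,
    whose common value is mu[k]; big-M constraints enforce the link.
    M is an assumed a priori bound on coefficient magnitudes."""
    n, p = X.shape
    model = gp.Model("l0-fusion")
    beta = model.addVars(p, lb=-M, ub=M, name="beta")    # regression coefficients
    mu = model.addVars(K, lb=-M, ub=M, name="mu")        # group-level common values
    z = model.addVars(p, K, vtype=GRB.BINARY, name="z")  # group memberships
    for j in range(p):
        # each coefficient belongs to exactly one group
        model.addConstr(gp.quicksum(z[j, k] for k in range(K)) == 1)
        for k in range(K):
            # z[j, k] = 1 forces beta[j] == mu[k]; the factor 2*M keeps the
            # constraint slack when z[j, k] = 0, since |beta[j] - mu[k]| <= 2*M
            model.addConstr(beta[j] - mu[k] <= 2 * M * (1 - z[j, k]))
            model.addConstr(mu[k] - beta[j] <= 2 * M * (1 - z[j, k]))
    # least-squares objective: (1 / 2n) * ||y - X beta||_2^2
    resid = [y[i] - gp.quicksum(X[i, j] * beta[j] for j in range(p))
             for i in range(n)]
    model.setObjective(gp.quicksum(r * r for r in resid) * (1.0 / (2 * n)),
                       GRB.MINIMIZE)
    model.optimize()
    return np.array([beta[j].X for j in range(p)])

Because the objective is convex quadratic and every constraint is linear in (beta, mu, z), the encoded problem is a mixed-integer quadratic program that branch-and-bound solvers handle natively. A warm start, as mentioned in the abstract, could be supplied through the solver's Start attribute, e.g., by initializing beta at a preliminary fused or clustered fit; this, too, is an illustrative choice rather than the paper's exact strategy.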
