Inference in High-Dimensional Single-Index Models under Symmetric Designs

We consider the problem of statistical inference for a finite number of covariates in a generalized single-index model with p > n covariates and an unknown (potentially random) link function, under an elliptically symmetric design. Under elliptical symmetry, the problem can be reformulated as a proxy linear model in terms of an identifiable parameter; this characterization is then used to construct estimates of the regression coefficients of interest that are analogous to the de-biased lasso estimates in the standard linear model and exhibit similar properties: root-n consistency and asymptotic normality. The procedure is agnostic in the sense that it completely bypasses estimation of the link function, which can be extremely challenging depending on the underlying structure of the problem. Our method allows testing for the significance of pre-fixed covariates in the single-index model, as well as testing for the relative importance of coefficients via a straightforward application of the delta method. Furthermore, under Gaussianity, we extend our approach to obtain improved, i.e., more efficient, estimates of the coefficients using a sieve strategy based on an expansion of the true regression function in Hermite polynomials. Finally, we illustrate our approach via carefully designed simulation experiments.
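The core idea above can be sketched numerically: for an isotropic Gaussian design, the single-index model y = f(Xβ) + ε behaves like a proxy linear model y ≈ c·Xβ + e (with c an unknown scale), so a lasso fit followed by a one-step de-biasing correction recovers the direction of β without ever estimating the link f. The sketch below is a minimal illustration under these assumptions, not the paper's exact procedure: the link (tanh), the dimensions, the regularization level, and the use of the identity matrix as the precision-matrix surrogate M (valid only because the design is isotropic) are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, s = 200, 400, 5                       # p > n, sparse index
beta = np.zeros(p)
beta[:s] = 1.0
beta /= np.linalg.norm(beta)                # identifiable up to scale only

X = rng.standard_normal((n, p))             # isotropic Gaussian design
y = np.tanh(3.0 * X @ beta) + 0.1 * rng.standard_normal(n)  # unknown link

def lasso_ista(X, y, lam, n_iter=500):
    """Plain ISTA (proximal gradient) solver for the lasso."""
    n = len(y)
    L = np.linalg.norm(X, 2) ** 2 / n       # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = X.T @ (X @ b - y) / n           # gradient of the squared loss
        z = b - g / L
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return b

lam = 0.5 * np.sqrt(2.0 * np.log(p) / n)    # illustrative tuning choice
b_lasso = lasso_ista(X, y, lam)

# De-biasing: one-step correction b + M X^T(y - X b)/n; here M = I because
# the design covariance is the identity (isotropic Gaussian assumption).
b_deb = b_lasso + X.T @ (y - X @ b_lasso) / n

# The estimate recovers beta up to scale and sign; compare directions.
cos_sim = abs(b_deb @ beta) / np.linalg.norm(b_deb)
support_hat = set(np.argsort(-np.abs(b_deb))[:s])
print(cos_sim > 0.5, support_hat == set(range(s)))
```

Note that even though y depends on Xβ through a bounded nonlinear link, the de-biased estimate aligns with the true direction and its largest coordinates identify the active covariates; the scale c is absorbed into the estimate, which is why only direction (and coordinate ratios, via the delta method) is inferable without knowing f.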
