Learning Single-Index Models in Gaussian Space

We consider regression problems where the response is a smooth but non-linear function of a k-dimensional projection of p normally distributed covariates, contaminated with additive Gaussian noise. The goal is to recover the range of the k-dimensional projection, i.e., the index space. This model is called the multi-index model, and the k = 1 case is called the single-index model. For the single-index model, we characterize the population landscape of a natural semi-parametric maximum likelihood objective in terms of the link function and prove that it has no spurious local minima. We also propose and analyze an efficient iterative procedure that recovers the index space up to error ε using a sample size Õ(p^{R²/μ} + p/ε²), where R and μ, respectively, parameterize the smoothness of the link function and the signal strength. When a multi-index model is incorrectly specified as a single-index model, we prove that essentially the same procedure, with sample size Õ(p^{R²/μ} + p/ε²), returns a vector that is ε-close to being completely in the index space.
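To make the setup concrete, the sketch below simulates a single-index model with Gaussian covariates and recovers the index direction with a simple Stein's-lemma (Brillinger-style) moment estimator. This is only an illustrative baseline, not the semi-parametric maximum likelihood procedure analyzed in the paper; the tanh link, the dimension p, the noise level, and the sample size are arbitrary choices for the example.

```python
import numpy as np

# Single-index model with Gaussian design:
#   y = g(<w*, x>) + noise,  x ~ N(0, I_p),  noise ~ N(0, sigma^2).
# All numeric values below are illustrative, not from the paper.
rng = np.random.default_rng(0)
p, n, sigma = 50, 20_000, 0.1

w_star = rng.standard_normal(p)
w_star /= np.linalg.norm(w_star)           # unit-norm index vector

g = np.tanh                                 # smooth link (example choice)
X = rng.standard_normal((n, p))             # Gaussian covariates
y = g(X @ w_star) + sigma * rng.standard_normal(n)

# Simple moment-based baseline: by Stein's lemma, for x ~ N(0, I_p),
#   E[y * x] = E[g'(<w*, x>)] * w*,
# so the empirical average of y_i * x_i points along the index direction
# whenever E[g'] != 0.
w_hat = X.T @ y / n
w_hat /= np.linalg.norm(w_hat)

# The direction is identified only up to sign, so measure error accordingly.
err = min(np.linalg.norm(w_hat - w_star), np.linalg.norm(w_hat + w_star))
print(f"estimation error: {err:.3f}")
```

Note that this first-moment baseline carries no signal when E[g'(<w*, x>)] = 0 (e.g., for even links), which is one reason higher-order or likelihood-based procedures, such as the one studied in this work, are of interest.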
