Convergence guarantee for the sparse monotone single index model

We consider a high-dimensional monotone single index model (hdSIM), a semiparametric extension of the high-dimensional generalized linear model (hdGLM) in which the link function is unknown but constrained to be monotone non-decreasing. We develop a scalable projection-based iterative approach, the “Sparse Orthogonal Descent Single-Index Model” (SOD-SIM), which alternates between sparse-thresholded orthogonalized “gradient-like” steps and isotonic regression steps to recover the coefficient vector. Our main contribution is a set of finite-sample estimation bounds for both the coefficient vector and the link function in high-dimensional settings under very mild assumptions on the design matrix X, the error term ε, and their dependence. The convergence rate for the link function matches the low-dimensional isotonic regression minimax rate, n^{-1/3}, up to poly-log terms. The convergence rate for the coefficients is also n^{-1/3} up to poly-log terms. The method can be applied to many real-data problems, including GLMs with a misspecified link, classification with mislabeled data, and classification with positive-unlabeled (PU) data. We study its performance through both numerical studies and an application to Rocker protein sequence data.
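
The alternating structure described above can be illustrated with a minimal sketch. This is not the SOD-SIM algorithm as specified in the paper: the function names, the step size, the initialization, and the way the update is "orthogonalized" (projecting the gradient-like direction onto the complement of the current coefficient vector, then renormalizing) are all assumptions made for illustration. The isotonic step uses the standard pool-adjacent-violators algorithm, and the sparsity step keeps the s largest entries in absolute value.

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to the sequence y."""
    sums, counts = [], []
    for v in y:
        sums.append(float(v))
        counts.append(1)
        # Merge the last two blocks while their means violate monotonicity.
        while len(sums) > 1 and sums[-2] / counts[-2] > sums[-1] / counts[-1]:
            s_last, c_last = sums.pop(), counts.pop()
            sums[-1] += s_last
            counts[-1] += c_last
    return np.concatenate([np.full(c, s / c) for s, c in zip(sums, counts)])

def fit_link(z, y):
    """Isotonic regression of y on the index z; fitted values in original order."""
    order = np.argsort(z, kind="stable")
    fitted = np.empty(len(y))
    fitted[order] = pava(y[order])
    return fitted

def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-s:]
    out[keep] = v[keep]
    return out

def sparse_isotron_sketch(X, y, s, n_iter=50, step=1.0):
    """Hypothetical sketch: alternate isotonic link fits with sparse updates."""
    n, _ = X.shape
    # Assumed initialization: thresholded marginal correlations, unit norm.
    beta = hard_threshold(X.T @ y / n, s)
    beta = beta / max(np.linalg.norm(beta), 1e-12)
    for _ in range(n_iter):
        fitted = fit_link(X @ beta, y)          # isotonic regression step
        g = X.T @ (y - fitted) / n              # "gradient-like" direction
        g = g - (g @ beta) * beta               # assumed orthogonalization
        new = hard_threshold(beta + step * g, s)
        norm = np.linalg.norm(new)
        if norm == 0:
            break                               # keep the previous iterate
        beta = new / norm
    return beta, fit_link(X @ beta, y)
```

The coefficient vector is kept at unit norm because the scale of the index cannot be separated from an unknown link; the returned fitted values are the isotonic fit evaluated at the final index.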
