Optimal Linear Estimation under Unknown Nonlinear Transform

Linear regression studies the problem of estimating a model parameter $\beta^* \in \mathbb{R}^p$ from $n$ observations $\{(y_i, x_i)\}_{i=1}^n$ drawn from the linear model $y_i = \langle x_i, \beta^* \rangle + \epsilon_i$. We consider a significant generalization in which the relationship between $\langle x_i, \beta^* \rangle$ and $y_i$ is noisy, quantized to a single bit, potentially nonlinear, noninvertible, as well as unknown. This model is known as the single-index model in statistics, and, among other things, it represents a significant generalization of one-bit compressed sensing. We propose a novel spectral-based estimation procedure and show that we can recover $\beta^*$ in settings (i.e., classes of link function $f$) where previous algorithms fail. In general, our algorithm requires only very mild restrictions on the (unknown) functional relationship between $y_i$ and $\langle x_i, \beta^* \rangle$. We also consider the high-dimensional setting where $\beta^*$ is sparse, and introduce a two-stage nonconvex framework that addresses estimation challenges in high-dimensional regimes where $p \gg n$. For a broad class of link functions between $\langle x_i, \beta^* \rangle$ and $y_i$, we establish minimax lower bounds that demonstrate the optimality of our estimators in both the classical and high-dimensional regimes.
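
To make the setting concrete, the following is a minimal sketch of one simple second-moment spectral estimator on simulated one-bit data. It is not claimed to be the procedure proposed in the paper. It assumes standard Gaussian covariates and a unit-norm β*, in which case E[y (x xᵀ − I)] = E[y (z² − 1)] β* β*ᵀ with z = 〈x, β*〉, so the leading eigenvector of the empirical version points along ±β*. The link sign(z² − 1), the label-flip probability, and all function names are hypothetical choices made for illustration. The link is even, so the first-moment estimate (1/n) Σ y_i x_i carries essentially no signal.

```python
import math
import random


def simulate(n, beta, flip_prob, rng):
    """Draw (x_i, y_i) from a one-bit single-index model.

    Assumptions (illustrative only): x_i is standard Gaussian and the
    link is the even function sign(z^2 - 1), followed by random label flips.
    """
    xs, ys = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in beta]
        z = sum(a * b for a, b in zip(x, beta))
        y = 1.0 if z * z > 1.0 else -1.0
        if rng.random() < flip_prob:
            y = -y
        xs.append(x)
        ys.append(y)
    return xs, ys


def first_moment(xs, ys):
    """Linear estimate (1/n) * sum_i y_i x_i; near zero for an even link."""
    n, p = len(xs), len(xs[0])
    return [sum(y * x[j] for x, y in zip(xs, ys)) / n for j in range(p)]


def second_moment(xs, ys):
    """Empirical M = (1/n) * sum_i y_i (x_i x_i^T - I)."""
    n, p = len(xs), len(xs[0])
    m = [[0.0] * p for _ in range(p)]
    for x, y in zip(xs, ys):
        for j in range(p):
            for k in range(p):
                m[j][k] += y * (x[j] * x[k] - (1.0 if j == k else 0.0))
    return [[v / n for v in row] for row in m]


def leading_eigenvector(m, iters=200):
    """Power iteration for the eigenvector of largest absolute eigenvalue."""
    p = len(m)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(m[j][k] * v[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


if __name__ == "__main__":
    rng = random.Random(0)
    beta_star = [0.6, -0.8, 0.0, 0.0, 0.0]  # unit norm, hypothetical
    xs, ys = simulate(5000, beta_star, flip_prob=0.1, rng=rng)
    beta_hat = leading_eigenvector(second_moment(xs, ys))
    lin = first_moment(xs, ys)
    # |cosine| between estimate and truth; the sign is not identifiable.
    print("spectral alignment:", abs(sum(a * b for a, b in zip(beta_hat, beta_star))))
    print("norm of linear estimate:", math.sqrt(sum(c * c for c in lin)))
```

With an even link, β* can only be recovered up to sign, which is why the sketch reports the absolute cosine. In the sparse regime with p ≫ n, a plain eigenvector computation like this would no longer be adequate, which is the case the abstract's two-stage framework addresses.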

Constantine Caramanis | Zhaoran Wang | Xinyang Yi | Han Liu
