LASSO with Non-linear Measurements Is Equivalent to One with Linear Measurements

Consider estimating an unknown but structured (e.g., sparse, low-rank) signal x₀ ∈ ℝⁿ from a vector y ∈ ℝᵐ of measurements of the form yᵢ = gᵢ(aᵢᵀx₀), where the aᵢ's are the rows of a known measurement matrix A and g(·) is a (potentially unknown) nonlinear and random link function. Such measurements arise in applications where the measurement device has nonlinearities and uncertainties; they can also arise by design, e.g., gᵢ(x) = sign(x + zᵢ) corresponds to noisy 1-bit quantized measurements. Motivated by the classical work of Brillinger and the more recent work of Plan and Vershynin, we estimate x₀ by solving the Generalized LASSO, i.e., x̂ := arg minₓ ‖y − Ax‖₂ + λf(x), for some regularization parameter λ > 0 and some (typically non-smooth) convex regularizer f(·) that promotes the structure of x₀, e.g., the ℓ₁-norm or the nuclear norm. While this approach may seem to naively ignore the nonlinearity g(·), both Brillinger (in the unconstrained case) and Plan and Vershynin have shown that, when the entries of A are iid standard normal, it yields a good estimator of x₀ up to a constant of proportionality μ that depends only on g(·). In this work, we considerably strengthen these results by obtaining explicit expressions for ‖x̂ − μx₀‖₂ for the regularized Generalized LASSO that are asymptotically precise when m and n grow large. A main result is that the estimation performance of the Generalized LASSO with nonlinear measurements is asymptotically the same as that of one whose measurements are linear, yᵢ = μaᵢᵀx₀ + σzᵢ, with μ = E[γg(γ)] and σ² = E[(g(γ) − μγ)²], where γ is standard normal. To the best of our knowledge, the derived expressions for the estimation performance are the first precise results in this context. One interesting consequence of our result is that the quantizer of the measurements that minimizes the estimation error of the Generalized LASSO is the celebrated Lloyd-Max quantizer.
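
The equivalence can be checked numerically. The sketch below is illustrative only, not the paper's code: it draws an iid Gaussian A and a sparse unit-norm x₀ (so that γ = aᵢᵀx₀ is standard normal), generates noiseless 1-bit measurements yᵢ = sign(aᵢᵀx₀), estimates μ and σ by Monte Carlo, and compares ‖x̂ − μx₀‖₂ under the nonlinear measurements against the surrogate linear model yᵢ = μaᵢᵀx₀ + σzᵢ. For simplicity it solves the squared-loss variant ½‖y − Ax‖₂² + λ‖x‖₁ by proximal gradient (ISTA); the problem sizes, iteration count, and choice of λ are arbitrary assumptions.

```python
# Minimal simulation sketch (illustrative, not the paper's code).
# Compares the Generalized LASSO on 1-bit measurements y_i = sign(a_i' x0)
# with the same LASSO on the surrogate linear model y_i = mu*a_i' x0 + sigma*z_i.
import numpy as np

rng = np.random.default_rng(0)

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=500):
    """Solve min_x 0.5*||y - A x||_2^2 + lam*||x||_1 by proximal gradient."""
    L = np.linalg.norm(A, 2) ** 2       # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - A.T @ (A @ x - y) / L, lam / L)
    return x

# k-sparse, unit-norm x0 (so gamma = a_i' x0 ~ N(0,1)) and iid Gaussian A.
n, m, k = 400, 800, 20
x0 = np.zeros(n)
x0[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x0 /= np.linalg.norm(x0)
A = rng.standard_normal((m, n))

g = np.sign                             # noiseless 1-bit link function

# mu = E[gamma*g(gamma)] and sigma^2 = E[(g(gamma) - mu*gamma)^2] by Monte
# Carlo; for g = sign these are sqrt(2/pi) and 1 - 2/pi in closed form.
gamma = rng.standard_normal(10**6)
mu = np.mean(gamma * g(gamma))
sigma = np.sqrt(np.mean((g(gamma) - mu * gamma) ** 2))

y_nonlin = g(A @ x0)                                    # nonlinear measurements
y_lin = mu * (A @ x0) + sigma * rng.standard_normal(m)  # equivalent linear model

lam = sigma * np.sqrt(2 * np.log(n))    # heuristic regularization level
err_nonlin = np.linalg.norm(lasso_ista(A, y_nonlin, lam) - mu * x0)
err_lin = np.linalg.norm(lasso_ista(A, y_lin, lam) - mu * x0)
print(f"mu={mu:.3f}, sigma={sigma:.3f}; "
      f"error nonlinear={err_nonlin:.3f}, linear={err_lin:.3f}")
```

Per the theorem, the two reported errors should agree increasingly well as m and n grow. The closing claim about quantization refers to the classical Lloyd-Max construction: alternate between placing decision boundaries at the midpoints of adjacent reconstruction levels and placing levels at the conditional means of their cells. A minimal sketch for a standard normal source, assuming scipy is available (initialization and iteration count are arbitrary choices):

```python
# Minimal Lloyd-Max sketch for a standard normal source (assumes scipy).
import numpy as np
from scipy.stats import norm

def lloyd_max_gaussian(n_levels, n_iter=200):
    """Alternate midpoint boundaries and conditional-mean levels for N(0,1)."""
    levels = np.linspace(-2.0, 2.0, n_levels)   # arbitrary initialization
    for _ in range(n_iter):
        # Boundaries: midpoints between adjacent levels, with +/- infinity ends.
        b = np.concatenate(([-np.inf], (levels[:-1] + levels[1:]) / 2, [np.inf]))
        # Levels: E[gamma | b_j < gamma <= b_{j+1}]
        #       = (phi(b_j) - phi(b_{j+1})) / (Phi(b_{j+1}) - Phi(b_j)).
        levels = (norm.pdf(b[:-1]) - norm.pdf(b[1:])) / (norm.cdf(b[1:]) - norm.cdf(b[:-1]))
    return levels

print(lloyd_max_gaussian(4))   # approximately [-1.510, -0.453, 0.453, 1.510]
```

For four levels this converges to approximately ±0.453 and ±1.510, the textbook Lloyd-Max levels for a unit-variance Gaussian source.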

[1] Y. Gordon, Some inequalities for Gaussian processes and applications, 1985.

[2] Volkan Cevher, et al., A totally unimodular view of structured sparsity, 2014, AISTATS.

[3] Constantine Caramanis, et al., Optimal Linear Estimation under Unknown Nonlinear Transform, 2015, NIPS.

[4] R. Tibshirani, Regression Shrinkage and Selection via the Lasso, 1996.

[5] W. Newey, et al., Large sample estimation and hypothesis testing, 1986.

[6] Andrea Montanari, et al., The Noise-Sensitivity Phase Transition in Compressed Sensing, 2010, IEEE Transactions on Information Theory.

[7] Christos Thrampoulidis, et al., Asymptotically exact error analysis for the generalized $\ell_2^2$-LASSO, 2015, 2015 IEEE International Symposium on Information Theory (ISIT).

[8] Francis R. Bach, et al., Structured sparsity-inducing norms through submodular functions, 2010, NIPS.

[9] Christos Thrampoulidis, et al., Precise error analysis of the LASSO, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[10] Mihailo Stojnic, et al., A framework to characterize performance of LASSO algorithms, 2013, ArXiv.

[11] Yaniv Plan, et al., The Generalized Lasso With Non-Linear Observations, 2015, IEEE Transactions on Information Theory.

[12] I. Johnstone, et al., Minimax Risk over ℓp-Balls for ℓq-Error, 1994.

[13] Andrea Montanari, et al., Accurate Prediction of Phase Transitions in Compressed Sensing via a Connection to Minimax Denoising, 2011, IEEE Transactions on Information Theory.

[14] H. Ichimura, et al., Semiparametric Least Squares (SLS) and Weighted SLS Estimation of Single-Index Models, 1993.

[15] Christos Thrampoulidis, et al., Regularized Linear Regression: A Precise Analysis of the Estimation Error, 2015, COLT.

[16] Andrea Montanari, et al., The LASSO Risk for Gaussian Matrices, 2010, IEEE Transactions on Information Theory.

[17] Y. Plan, et al., High-dimensional estimation with geometric constraints, 2014, arXiv:1404.3749.

[18] Christos Thrampoulidis, et al., A Tight Version of the Gaussian Min-Max Theorem in the Presence of Convexity, 2014, ArXiv.

[19] Christos Thrampoulidis, et al., The squared-error of generalized LASSO: A precise analysis, 2013, 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[20] Richard G. Baraniuk, et al., Asymptotic Analysis of Complex LASSO via Complex Approximate Message Passing (CAMP), 2011, IEEE Transactions on Information Theory.

[21] D. Brillinger, A Generalized Linear Model With "Gaussian" Regressor Variables, 2012.

[22] Christos Thrampoulidis, et al., Asymptotically Exact Error Analysis for the Generalized $\ell_2^2$-LASSO, 2015, ISIT 2015.

[23] Y. Gordon, On Milman's inequality and random subspaces which escape through a mesh in ℝⁿ, 1988.

[24] Douglas C. Montgomery, et al., The Generalized Linear Model, 2012.

[25] A. Belloni, et al., Square-Root Lasso: Pivotal Recovery of Sparse Signals via Conic Programming, 2011.

[26] R. Tibshirani, et al., Sparsity and smoothness via the fused lasso, 2005.

[27] M. Yuan, et al., Model selection and estimation in regression with grouped variables, 2006.

[28] D. Brillinger, The identification of a particular nonlinear time series system, 1977.

[29] I. Johnstone, et al., Minimax risk over ℓp-balls for ℓq-error, 1994.

[30] Pablo A. Parrilo, et al., The Convex Geometry of Linear Inverse Problems, 2010, Foundations of Computational Mathematics.

[31] Ker-Chau Li, et al., Regression Analysis Under Link Violation, 1989.

[32] A. Garnham, et al., A note on least squares sensitivity in single-index model estimation and the benefits of response transformations, 2013.

[33] Andrea Montanari, et al., Message-passing algorithms for compressed sensing, 2009, Proceedings of the National Academy of Sciences.

[34] Armeen Taeb, et al., Maximin Analysis of Message Passing Algorithms for Recovering Block Sparse Signals, 2013, ArXiv.

[35] Joel A. Tropp, et al., Living on the edge: phase transitions in convex programs with random data, 2013, arXiv:1303.6672.