Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing

SLOPE is a relatively new convex optimization procedure for high-dimensional linear regression via the sorted $\ell_1$ penalty: the higher a fitted coefficient ranks in magnitude, the heavier the penalty it receives. This non-separable penalty renders many existing techniques invalid or inconclusive for analyzing the SLOPE solution. In this paper, we develop an asymptotically exact characterization of the SLOPE solution under Gaussian random designs by solving the SLOPE problem with approximate message passing (AMP). This algorithmic approach allows us to approximate the SLOPE solution via the much more amenable AMP iterates. Explicitly, we characterize the asymptotic dynamics of the AMP iterates by relying on a recently developed state evolution analysis for non-separable penalties, thereby overcoming the difficulty caused by the sorted $\ell_1$ penalty. Moreover, we prove that the AMP iterates converge to the SLOPE solution in an asymptotic sense, and numerical simulations show that the convergence is surprisingly fast. Our proof rests on a novel technique that specifically leverages the structure of the SLOPE problem. In contrast to prior literature, our work not only yields an asymptotically sharp analysis but also offers an algorithmic, flexible, and constructive approach to understanding the SLOPE problem.
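
To make the sorted $\ell_1$ penalty concrete: its proximal operator, $\mathrm{prox}_{J_\lambda}(y) = \arg\min_x \frac{1}{2}\|x - y\|^2 + \sum_i \lambda_i |x|_{(i)}$ with $\lambda_1 \geq \cdots \geq \lambda_p \geq 0$, can be evaluated exactly by sorting the magnitudes, shifting them by the weights, and then pooling adjacent violators to enforce a non-increasing solution. Below is a minimal NumPy sketch of this standard stack-based scheme; the function name and interface are illustrative, not taken from the paper:

```python
import numpy as np

def prox_sorted_l1(y, lam):
    """Prox of the sorted-l1 penalty J_lam(x) = sum_i lam_i |x|_(i),
    where lam is non-negative and non-increasing. A minimal sketch of
    the usual stack-based pool-adjacent-violators (PAVA) scheme."""
    sign = np.sign(y)
    v = np.abs(y)
    order = np.argsort(-v)          # indices sorting |y| in decreasing order
    z = v[order] - lam              # shift sorted magnitudes by the weights
    # PAVA: merge adjacent blocks whose averages violate monotonicity,
    # so the fitted sequence of block means is non-increasing.
    start, end, total = [], [], []  # per-block bookkeeping (index range, sum)
    for i, zi in enumerate(z):
        start.append(i); end.append(i); total.append(zi)
        # merge while mean(previous block) <= mean(current block),
        # written cross-multiplied to avoid divisions
        while len(total) > 1 and total[-2] * (end[-1] - start[-1] + 1) <= \
                                 total[-1] * (end[-2] - start[-2] + 1):
            total[-2] += total[-1]
            end[-2] = end[-1]
            start.pop(); end.pop(); total.pop()
    x_sorted = np.zeros_like(z)
    for s, e, t in zip(start, end, total):
        x_sorted[s:e + 1] = max(t / (e - s + 1), 0.0)  # clip block mean at 0
    x = np.zeros_like(v)
    x[order] = x_sorted             # undo the sort, restore signs
    return sign * x
```

Note that the block averaging produces exact ties among grouped coefficients, which is precisely how SLOPE clusters coefficients of similar magnitude.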

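Given this prox, the AMP iteration described in the abstract takes a familiar form: a prox step applied to a corrected residual, plus an Onsager reaction term. For the sorted $\ell_1$ prox, the divergence entering the Onsager term equals the number of distinct nonzero magnitudes of the prox output. The sketch below is schematic and simplified (assumptions: a fixed threshold sequence `theta` and iteration count, whereas the paper recalibrates the thresholds at each iteration via state evolution; all names are ours):

```python
def slope_amp(A, y, theta, n_iter=30):
    """Schematic AMP for SLOPE under a Gaussian random design A (n x p).
    Simplified sketch: theta is held fixed across iterations."""
    n, p = A.shape
    x = np.zeros(p)
    z = y.copy()
    for _ in range(n_iter):
        x = prox_sorted_l1(x + A.T @ z, theta)
        # Onsager correction: the divergence of the sorted-l1 prox is the
        # number of distinct nonzero magnitudes of its output
        div = len(np.unique(np.abs(x[x != 0])))
        z = y - A @ x + z * (div / n)
    return x
```

The Onsager term is what distinguishes AMP from plain iterative soft-thresholding on this problem: it decouples the iterates so that their asymptotic dynamics are captured by a scalar state evolution, which is the mechanism the paper exploits to characterize the SLOPE solution.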