High Dimensional Regression with Binary Coefficients. Estimating Squared Error and a Phase Transition

We consider a sparse linear regression model Y=X\beta^{*}+W, where X has i.i.d. Gaussian entries, W is a noise vector with i.i.d. mean-zero Gaussian entries of variance \sigma^{2}, and \beta^{*} is a binary vector with support size (sparsity) k. Using a novel conditional second moment method, we obtain an approximation, tight up to a multiplicative constant, of the optimal squared error \min_{\beta}\|Y-X\beta\|_{2}, where the minimization is over all k-sparse binary vectors \beta. The approximation reveals interesting structural properties of the underlying regression problem. In particular, a) we establish that n^{*}=2k\log p/\log(2k/\sigma^{2}+1) is a phase transition point with the following "all-or-nothing" property: when n exceeds n^{*}, (2k)^{-1}\|\beta_{2}-\beta^{*}\|_{0}\approx 0, and when n is below n^{*}, (2k)^{-1}\|\beta_{2}-\beta^{*}\|_{0}\approx 1, where \beta_{2} is the optimal solution achieving the smallest squared error. With this we prove that n^{*} is the asymptotic information-theoretic threshold for recovering \beta^{*}. b) We compute the squared error for an intermediate problem \min_{\beta}\|Y-X\beta\|_{2}, where the minimization is restricted to vectors \beta with \|\beta-\beta^{*}\|_{0}=2k\zeta for \zeta\in[0,1]. We show that the lower-bound part \Gamma(\zeta) of the estimate, which corresponds to the estimate obtained by the first moment method, undergoes a phase transition at three different thresholds: first at n_{\text{inf,1}}=\sigma^{2}\log p, which is the information-theoretic bound for recovering \beta^{*} when k=1 and \sigma is large, then at n^{*}, and finally at n_{\text{LASSO/CS}}. c) We establish a certain Overlap Gap Property (OGP) on the space of all binary vectors \beta when n\le ck\log p for a sufficiently small constant c. We conjecture that OGP is the source of algorithmic hardness of solving the minimization problem \min_{\beta}\|Y-X\beta\|_{2} in the regime n<n_{\text{LASSO/CS}}.
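To make the "all-or-nothing" behavior in a) concrete, the following is a minimal simulation sketch (not from the paper; the helper name simulate, the problem sizes, and the trial counts are our own illustrative choices). It brute-forces the k-sparse binary least-squares problem at sample sizes on either side of n^{*} and reports the normalized Hamming error (2k)^{-1}\|\beta_{2}-\beta^{*}\|_{0}. Since the theorem is asymptotic in p, a tiny instance can only suggest the transition, not reproduce it sharply.

```python
import itertools
import numpy as np

def simulate(n, p, k, sigma, rng):
    """Draw Y = X beta* + W and brute-force the k-sparse binary LS fit."""
    X = rng.standard_normal((n, p))              # i.i.d. Gaussian design
    support = rng.choice(p, size=k, replace=False)
    beta_star = np.zeros(p)
    beta_star[support] = 1.0                     # binary k-sparse signal
    Y = X @ beta_star + sigma * rng.standard_normal(n)

    best_err, best_support = np.inf, None
    # Exhaustive search over all C(p, k) supports -- feasible only for tiny p, k.
    for S in itertools.combinations(range(p), k):
        err = np.linalg.norm(Y - X[:, list(S)].sum(axis=1))  # ||Y - X beta||_2
        if err < best_err:
            best_err, best_support = err, S
    # Normalized Hamming distance (2k)^{-1} ||beta_2 - beta*||_0: for binary
    # vectors this is the symmetric difference of the two supports over 2k.
    return len(set(best_support) ^ set(support)) / (2 * k)

rng = np.random.default_rng(0)
p, k, sigma = 30, 3, 0.5
n_star = 2 * k * np.log(p) / np.log(2 * k / sigma**2 + 1)  # phase transition point
for n in (int(0.5 * n_star) + 1, int(2 * n_star)):         # below and above n*
    errs = [simulate(n, p, k, sigma, rng) for _ in range(20)]
    print(f"n = {n} (n* ~ {n_star:.1f}): mean normalized error {np.mean(errs):.2f}")
```

Above n^{*} the reported error should concentrate near 0 and below it near 1; the exhaustive search is exactly what makes the estimator information-theoretically optimal yet algorithmically expensive, which is the tension the OGP conjecture in c) addresses.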
