High Dimensional Linear Regression using Lattice Basis Reduction

We consider a high dimensional linear regression problem where the goal is to efficiently recover an unknown vector $\beta^*$ from $n$ noisy linear observations $Y=X\beta^*+W \in \mathbb{R}^n$, for known $X \in \mathbb{R}^{n \times p}$ and unknown $W \in \mathbb{R}^n$. Unlike most of the literature on this model we make no sparsity assumption on $\beta^*$. Instead we adopt a regularization based on assuming that the underlying vectors $\beta^*$ have rational entries with the same denominator $Q \in \mathbb{Z}_{>0}$. We call this $Q$-rationality assumption. We propose a new polynomial-time algorithm for this task which is based on the seminal Lenstra-Lenstra-Lovasz (LLL) lattice basis reduction algorithm. We establish that under the $Q$-rationality assumption, our algorithm recovers exactly the vector $\beta^*$ for a large class of distributions for the iid entries of $X$ and non-zero noise $W$. We prove that it is successful under small noise, even when the learner has access to only one observation ($n=1$). Furthermore, we prove that in the case of the Gaussian white noise for $W$, $n=o\left(p/\log p\right)$ and $Q$ sufficiently large, our algorithm tolerates a nearly optimal information-theoretic level of the noise.

[1]  D. Donoho,et al.  Counting faces of randomly-projected polytopes when the projection radically lowers dimension , 2006, math/0607364.

[2]  S. Frick,et al.  Compressed Sensing , 2014, Computer Vision, A Reference Guide.

[3]  David Gamarnik,et al.  Sparse High-Dimensional Linear Regression. Algorithmic Barriers and a Local Search Algorithm , 2017, 1711.04952.

[4]  Yonina C. Eldar,et al.  Phase Retrieval via Matrix Completion , 2011, SIAM Rev..

[5]  Abraham Lempel,et al.  Cryptology in Transition , 1979, CSUR.

[6]  Adel Javanmard,et al.  Information-Theoretically Optimal Compressed Sensing via Spatial Coupling and Approximate Message Passing , 2011, IEEE Transactions on Information Theory.

[7]  Alan M. Frieze,et al.  On the Lagarias-Odlyzko Algorithm for the Subset Sum Problem , 1986, SIAM J. Comput..

[8]  Martin J. Wainwright,et al.  Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using $\ell _{1}$ -Constrained Quadratic Programming (Lasso) , 2009, IEEE Transactions on Information Theory.

[9]  David L. Donoho,et al.  Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing , 2009, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[10]  Paul Erdös,et al.  On the probability that n. and g(n) are relatively prime , 1959 .

[11]  Florent Krzakala,et al.  Statistical physics-based reconstruction in compressed sensing , 2011, ArXiv.

[12]  László Lovász,et al.  Factoring polynomials with rational coefficients , 1982 .

[13]  Alexandros G. Dimakis,et al.  Compressed Sensing using Generative Models , 2017, ICML.

[14]  Adi Shamir,et al.  A polynomial time algorithm for breaking the basic Merkle-Hellman cryptosystem , 1984, 23rd Annual Symposium on Foundations of Computer Science (sfcs 1982).

[15]  Michael A. Saunders,et al.  Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..

[16]  Holger Rauhut,et al.  A Mathematical Introduction to Compressive Sensing , 2013, Applied and Numerical Harmonic Analysis.

[17]  Mazen Al Borno,et al.  Reduction in Solving Some Integer Least Squares Problems , 2011, ArXiv.

[18]  Babak Hassibi,et al.  On the expected complexity of integer least-squares problems , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[19]  Thomas M. Cover,et al.  Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing) , 2006 .

[20]  Stephen P. Boyd,et al.  Integer parameter estimation in linear models with applications to GPS , 1998, IEEE Trans. Signal Process..

[21]  R. Goodstein,et al.  An introduction to the theory of numbers , 1961 .

[22]  E. Candès,et al.  Stable signal recovery from incomplete and inaccurate measurements , 2005, math/0503066.

[23]  D. Donoho,et al.  Neighborliness of randomly projected simplices in high dimensions. , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[24]  J. Boutros,et al.  Euclidean space lattice decoding for joint detection in CDMA systems , 1999, Proceedings of the 1999 IEEE Information Theory and Communications Workshop (Cat. No. 99EX253).

[25]  Jeffrey C. Lagarias,et al.  Solving low density subset sum problems , 1983, 24th Annual Symposium on Foundations of Computer Science (sfcs 1983).

[26]  Martin E. Hellman,et al.  Hiding information and signatures in trapdoor knapsacks , 1978, IEEE Trans. Inf. Theory.

[27]  E. T. An Introduction to the Theory of Numbers , 1946, Nature.