Trace Reconstruction: Generalized and Parameterized

In the beautifully simple-to-state problem of trace reconstruction, the goal is to reconstruct an unknown binary string $x$ given random "traces" of $x$, where each trace is generated by deleting each coordinate of $x$ independently with probability $p<1$. The problem is well studied both when the unknown string is arbitrary and when it is chosen uniformly at random. In both settings, an exponential gap remains between the upper and lower bounds on sample complexity, and our understanding of the problem is surprisingly limited. In this paper, we consider natural parameterizations and generalizations of this problem in an effort to attain a deeper and more comprehensive understanding. In the matrix version of the problem, each row and column of an unknown $\sqrt{n}\times \sqrt{n}$ matrix is deleted independently with probability $p$. We prove that $\exp(O(n^{1/4} \sqrt{\log n}))$ traces suffice for reconstructing arbitrary matrices. This result contrasts with sequence reconstruction, where the best known upper bound is $\exp(O(n^{1/3}))$. We also obtain an optimal result for random matrix reconstruction: we show that $\Theta(\log n)$ traces are necessary and sufficient. This is in contrast to the problem for random sequences, where there is a super-logarithmic lower bound and the best known upper bound is $\exp(O(\log^{1/3} n))$. For strings, we show that $\exp(O(k^{1/3}\log^{2/3} n))$ traces suffice to reconstruct $k$-sparse strings, providing an improvement over the best known sequence reconstruction results when $k = o(n/\log^2 n)$. Finally, we show that $\textrm{poly}(n)$ traces suffice if $x$ is $k$-sparse and we additionally have a "separation" promise, specifically that the indices of 1's in $x$ all differ by $\Omega(k \log n)$.
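To make the two observation models concrete, the following is a minimal Python sketch of how a single trace could be sampled in the string setting and in the matrix setting. It illustrates only the deletion channels described above, not any reconstruction algorithm from the paper. The function names `string_trace`, `matrix_trace`, and `is_subsequence` are hypothetical, and strings and matrices are represented as a Python `str` and a list of lists purely for illustration.

```python
import random


def string_trace(x, p, rng):
    """Sample one trace of the binary string x.

    Each coordinate is deleted independently with probability p;
    the surviving symbols keep their original order.
    """
    return "".join(c for c in x if rng.random() >= p)


def matrix_trace(m, p, rng):
    """Sample one trace of the matrix m (a list of equal-length rows).

    Each row and each column is deleted independently with probability p;
    the trace is the submatrix on the surviving rows and columns.
    """
    n_cols = len(m[0]) if m else 0
    keep_rows = [i for i in range(len(m)) if rng.random() >= p]
    keep_cols = [j for j in range(n_cols) if rng.random() >= p]
    return [[m[i][j] for j in keep_cols] for i in keep_rows]


def is_subsequence(t, x):
    """Return True if t can be obtained from x by deleting symbols."""
    remaining = iter(x)
    return all(c in remaining for c in t)
```

With `rng = random.Random(0)`, calling `string_trace("1011001110", 0.3, rng)` repeatedly yields independent traces, each of which is a subsequence of the input; `matrix_trace` behaves analogously, returning a submatrix whose dimensions vary from call to call.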
