Subpolynomial trace reconstruction for random strings and arbitrary deletion probability

The insertion-deletion channel takes as input a bit string ${\bf x}\in\{0,1\}^{n}$, and outputs a string where bits have been deleted and inserted independently at random. The trace reconstruction problem is to recover $\bf x$ from many independent outputs (called "traces") of the insertion-deletion channel applied to $\bf x$. We show that if $\bf x$ is chosen uniformly at random, then $\exp(O(\log^{1/3} n))$ traces suffice to reconstruct $\bf x$ with high probability. For the deletion channel with deletion probability $q < 1/2$ the earlier upper bound was $\exp(O(\log^{1/2} n))$. The case of $q\geq 1/2$ or the case where insertions are allowed has not been previously analyzed, and therefore the earlier upper bound was as for worst-case strings, i.e., $\exp(O( n^{1/3}))$. We also show that our reconstruction algorithm runs in $n^{1+o(1)}$ time. A key ingredient in our proof is a delicate two-step alignment procedure where we estimate the location in each trace corresponding to a given bit of $\bf x$. The alignment is done by viewing the strings as random walks and comparing the increments in the walk associated with the input string and the trace, respectively.

[1]  Yuval Peres,et al.  Trace reconstruction with exp(O(n1/3)) samples , 2017, STOC.

[2]  Michael Mitzenmacher,et al.  A Survey of Results for Deletion Channels and Related Synchronization Channels , 2008, SWAT.

[3]  Sampath Kannan,et al.  Reconstructing strings from random traces , 2004, SODA '04.

[4]  T. E. Harris A lower bound for the critical probability in a certain percolation process , 1960, Mathematical Proceedings of the Cambridge Philosophical Society.

[5]  Sampath Kannan,et al.  More on reconstructing strings from random traces: insertions and deletions , 2005, Proceedings. International Symposium on Information Theory, 2005. ISIT 2005..

[6]  Krishnamurthy Viswanathan,et al.  Improved string reconstruction over insertion-deletion channels , 2008, SODA '08.

[7]  Narendra Karmarkar,et al.  A new polynomial-time algorithm for linear programming , 1984, STOC '84.

[8]  H. Poincaré,et al.  Percolation ? , 1982 .

[9]  Rina Panigrahy,et al.  Trace reconstruction with constant deletion probability and related results , 2008, SODA '08.

[10]  Yuval Peres,et al.  Average-Case Reconstruction for the Deletion Channel: Subpolynomially Many Traces Suffice , 2017, 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS).

[11]  Vladimir I. Levenshtein,et al.  Efficient reconstruction of sequences , 2001, IEEE Trans. Inf. Theory.

[12]  Ryan O'Donnell,et al.  Optimal mean-based algorithms for trace reconstruction , 2017, STOC.

[13]  Russell Lyons,et al.  Lower bounds for trace reconstruction , 2018, ArXiv.

[14]  Sofya Vorotnikova,et al.  Trace Reconstruction Revisited , 2014, ESA.

[15]  Kazuoki Azuma WEIGHTED SUMS OF CERTAIN DEPENDENT RANDOM VARIABLES , 1967 .

[16]  W. Hoeffding Probability Inequalities for sums of Bounded Random Variables , 1963 .

[17]  Vladimir I. Levenshtein,et al.  Efficient Reconstruction of Sequences from Their Subsequences or Supersequences , 2001, J. Comb. Theory A.

[18]  Tamás Erdélyi,et al.  LITTLEWOOD-TYPE PROBLEMS ON SUBARCS OF THE UNIT CIRCLE , 1997 .

[19]  Zachary Chase New Upper Bounds for Trace Reconstruction , 2020, ArXiv.