Trace Reconstruction with Bounded Edit Distance

The trace reconstruction problem asks how many noisy samples are needed to recover an unknown string $\mathrm{x} \in \{0,1\}^{n}$ with high probability, where the samples are obtained independently by passing $\mathrm{x}$ through a random deletion channel with deletion probability $q$. The problem has received significant attention recently due to its applications in DNA sequencing and DNA storage. Yet there is still an exponential gap between the known upper and lower bounds for the trace reconstruction problem. In this paper we study the trace reconstruction problem when $\mathrm{x}$ is confined to an edit distance ball of radius $k$, which is essentially equivalent to distinguishing two strings with edit distance at most $k$. We show that $n^{O(k)}$ samples suffice to achieve this task with high probability.
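To make the sampling model concrete, the following is a minimal sketch of the deletion channel described above, in which each bit of $\mathrm{x}$ is deleted independently with probability $q$ and the surviving bits are concatenated into a trace. The function names `deletion_channel`, `sample_traces`, and `is_subsequence` are illustrative assumptions, not notation from the paper, and the sketch only simulates the channel; it does not implement the reconstruction procedure.

```python
import random


def deletion_channel(x, q, rng):
    """Return one trace of x: each symbol is deleted independently with probability q."""
    # rng.random() is uniform on [0, 1), so a symbol survives with probability 1 - q.
    return "".join(bit for bit in x if rng.random() >= q)


def sample_traces(x, q, num_traces, seed=0):
    """Draw num_traces independent traces of x (hypothetical helper; seeded for repeatability)."""
    rng = random.Random(seed)
    return [deletion_channel(x, q, rng) for _ in range(num_traces)]


def is_subsequence(t, x):
    """Check that t can be obtained from x by deleting symbols."""
    it = iter(x)
    return all(c in it for c in t)


if __name__ == "__main__":
    x = "1011001110"
    for trace in sample_traces(x, q=0.3, num_traces=5):
        print(trace)
```

Every trace is a subsequence of $\mathrm{x}$, and its length is binomially distributed with parameters $n$ and $1-q$; the reconstruction question is how many such traces are needed before $\mathrm{x}$ can be identified.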
