The hybrid k-deck problem: Reconstructing sequences from short and long traces

We introduce a new variant of the k-deck problem, which in its traditional formulation asks for the smallest k that allows one to reconstruct any binary sequence of length n from the multiset of its length-k subsequences. In our version, termed the hybrid k-deck problem, one is additionally given a certain number of special subsequences of length n − t, t > 0, of the length-n sequence, and the question of interest is to determine the smallest value of k such that the k-deck, together with these subsequences, allows for error-free reconstruction of the original sequence. We first consider the case in which one is given a single subsequence of length n − t, obtained by deleting zeros only, and seek the value of k that allows for hybrid reconstruction. We prove that in this case, k ∈ [log t + 2, min{t + 1, O(√n)}]. We then extend the single-subsequence setup to the case where one is given M subsequences of length n − t, each obtained by deleting zeros only. In this case, we first aggregate the asymmetric traces and then invoke the single-trace results. The analysis and the problem are motivated by nanopore sequencing problems for DNA-based data storage.
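
To make the definitions concrete, the following is a minimal brute-force sketch, not the method of the paper. It computes the k-deck of a binary string, checks whether a trace can arise from zero-only deletions, and searches for the smallest k that identifies one given sequence from its k-deck plus one such trace. The function names (`k_deck`, `is_zero_deletion_trace`, `min_hybrid_k_for`) are hypothetical, and the search is per-instance and exponential in n, whereas the problem above concerns the worst case over all sequences.

```python
from collections import Counter
from itertools import combinations, product


def k_deck(x: str, k: int) -> Counter:
    """Multiset of all length-k subsequences of x (one per index subset)."""
    return Counter(
        "".join(x[i] for i in idx) for idx in combinations(range(len(x)), k)
    )


def is_zero_deletion_trace(x: str, y: str) -> bool:
    """True if y can be obtained from x by deleting zeros only.

    Splitting on '1' yields the runs of zeros between consecutive ones;
    every one must survive, and each zero run may only shrink.
    """
    x_runs, y_runs = x.split("1"), y.split("1")
    if len(x_runs) != len(y_runs):
        return False
    return all(len(b) <= len(a) for a, b in zip(x_runs, y_runs))


def min_hybrid_k_for(x: str, y: str) -> int:
    """Smallest k such that x is the only length-n binary string that
    (a) has the same k-deck as x and (b) admits y as a zero-deletion trace.

    Illustrative assumption: exhaustive search over all 2^n candidates.
    """
    if not is_zero_deletion_trace(x, y):
        raise ValueError("y is not a zero-deletion trace of x")
    n = len(x)
    # Only candidates consistent with the trace can be confused with x.
    candidates = [
        c
        for c in ("".join(bits) for bits in product("01", repeat=n))
        if c != x and is_zero_deletion_trace(c, y)
    ]
    for k in range(1, n + 1):
        deck = k_deck(x, k)
        if all(k_deck(c, k) != deck for c in candidates):
            return k
    return n  # the n-deck is x itself, so this is never reached


if __name__ == "__main__":
    # "0110" and "1001" share a 2-deck but not a 3-deck.
    print(k_deck("0110", 2) == k_deck("1001", 2))
    print(k_deck("0110", 3) == k_deck("1001", 3))
    # Trace "11" is "0110" with both zeros deleted (t = 2).
    print(min_hybrid_k_for("0110", "11"))
```

Restricting the candidates to strings consistent with the trace is what distinguishes the hybrid setting from the classical one: the k-deck only has to separate the original sequence from the strings the trace has not already ruled out.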
