Synchronization Strings: List Decoding for Insertions and Deletions

We study codes that are list-decodable under insertions and deletions. Specifically, we consider the setting where a codeword over some finite alphabet of size $q$ may suffer a $\delta$ fraction of adversarial deletions and a $\gamma$ fraction of adversarial insertions. A code is said to be $L$-list-decodable if there is an (efficient) algorithm that, given a received word, reports a list of $L$ codewords that includes the original codeword. Using the concept of synchronization strings, introduced by the first two authors [STOC 2017], we show some surprising results. We show that for every $0\leq\delta<1$, every $0\leq\gamma<\infty$ and every $\epsilon>0$ there exist efficient codes of rate $1-\delta-\epsilon$ and constant alphabet (so $q=O_{\delta,\gamma,\epsilon}(1)$) and sub-logarithmic list sizes. We stress that the fraction of insertions can be arbitrarily large and the rate is independent of this parameter. Our result sheds light on the remarkable asymmetry between the impact of insertions and deletions from the point of view of error-correction: whereas deletions cost in the rate of the code, insertion costs are borne by the adversary and not the code! We also prove several tight bounds on the parameters of list-decodable insdel codes. In particular, we show that the alphabet size of insdel codes needs to be exponentially large in $\epsilon^{-1}$, where $\epsilon$ is the gap to capacity above. Our result even applies to settings where the unique-decoding capacity equals the list-decoding capacity, and when it does so, it shows that the alphabet size needs to be exponentially large in the gap to capacity. This is in sharp contrast to the Hamming error model, where an alphabet size polynomial in $\epsilon^{-1}$ suffices for unique decoding, and it also shows that the exponential dependence of the alphabet size on $\epsilon^{-1}$ in previous works that constructed insdel codes is actually necessary!
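To make the decoding condition concrete, here is a minimal Python sketch of brute-force list decoding in the insdel setting. It is not the efficient synchronization-string construction described above: it simply checks every codeword, so it is only usable on toy codes. The function names (`lcs_length`, `consistent`, `brute_force_list_decode`) and the toy code are hypothetical and chosen for illustration. The sketch relies on a standard fact: a received word can arise from a codeword of length $n$ via at most $\delta n$ deletions and $\gamma n$ insertions exactly when their longest common subsequence leaves at most $\delta n$ codeword symbols and at most $\gamma n$ received symbols unmatched.

```python
from typing import List, Sequence


def lcs_length(a: Sequence, b: Sequence) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def consistent(codeword: str, received: str, delta: float, gamma: float) -> bool:
    """True if `received` can be obtained from `codeword` using at most
    delta*n deletions and gamma*n insertions, where n = len(codeword)."""
    n = len(codeword)
    common = lcs_length(codeword, received)
    deletions = n - common                # codeword symbols left unmatched
    insertions = len(received) - common   # received symbols left unmatched
    return deletions <= delta * n and insertions <= gamma * n


def brute_force_list_decode(code: List[str], received: str,
                            delta: float, gamma: float) -> List[str]:
    """Return every codeword consistent with `received` (exhaustive search)."""
    return [c for c in code if consistent(c, received, delta, gamma)]


# Hypothetical toy code of block length 8 over the binary alphabet.
toy_code = ["00000000", "11111111", "01010101"]
# "01010101" with its last two symbols deleted and "1111" appended.
received = "0101011111"
print(brute_force_list_decode(toy_code, received, delta=0.25, gamma=1.0))
```

In this toy run more than one codeword can be consistent with the same received word, which is why the decoder returns a list rather than a single answer.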
