Efficient Linear and Affine Codes for Correcting Insertions/Deletions

This paper studies \emph{linear} and \emph{affine} error-correcting codes for correcting synchronization errors such as insertions and deletions. We call such codes linear/affine insdel codes. Linear codes that can correct even a single deletion are limited to information rate at most $1/2$ (achieved by the trivial 2-fold repetition code). Previously, it was (erroneously) reported that, more generally, no non-trivial linear codes correcting $k$ deletions exist, i.e., that the $(k+1)$-fold repetition code and its rate of $1/(k+1)$ are essentially optimal for any $k$. We disprove this and show the existence of binary linear codes of length $n$ and rate just below $1/2$ capable of correcting $\Omega(n)$ insertions and deletions. This identifies rate $1/2$ as a sharp threshold for recovery from deletions for linear codes, and reopens the quest for a better understanding of the capabilities of linear codes for correcting insertions/deletions. We prove novel outer bounds and existential inner bounds for the rate vs. (edit) distance trade-off of linear insdel codes. We complement our existential results with an efficient synchronization-string-based transformation that converts any asymptotically-good linear code for Hamming errors into an asymptotically-good linear code for insdel errors. Lastly, we show that the rate-$1/2$ limitation does not hold for affine codes by giving an explicit affine code of rate $1-\epsilon$ which can efficiently correct a constant fraction of insdel errors.
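To make the trivial baseline concrete, the following is a minimal sketch of the binary $(k+1)$-fold repetition code mentioned above, together with a decoder for up to $k$ deletions. It is an illustration only and not a construction from the paper; the function names `encode`, `decode`, and `delete` are hypothetical. The decoder relies on a simple observation: a run of $m$ equal message bits becomes a run of $m(k+1)$ code symbols, and at most $k$ deletions can neither erase a run nor merge two runs, so a received run of length $r$ must come from $\lceil r/(k+1) \rceil$ message bits.

```python
from itertools import groupby


def encode(bits, k):
    """Encode with the (k+1)-fold repetition code: repeat every bit k+1 times."""
    return [b for b in bits for _ in range(k + 1)]


def delete(word, positions):
    """Simulate the channel: drop the symbols at the given positions."""
    drop = set(positions)
    return [s for i, s in enumerate(word) if i not in drop]


def decode(received, k):
    """Recover the message from a codeword that lost at most k symbols.

    Each maximal run of r equal symbols stems from ceil(r / (k+1))
    equal message bits, because at most k symbols were deleted in total.
    """
    message = []
    for bit, run in groupby(received):
        r = len(list(run))
        message.extend([bit] * (-(-r // (k + 1))))  # ceiling division
    return message


if __name__ == "__main__":
    k = 2
    msg = [1, 0, 0, 1]
    corrupted = delete(encode(msg, k), [0, 5])  # two deletions
    print(decode(corrupted, k) == msg)
```

The rate of this code is $1/(k+1)$, which is exactly the baseline that the existence result above improves on for linear codes.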
