The zero-rate threshold for adversarial bit-deletions is less than 1/2

We prove that there exists an absolute constant δ > 0 such that any binary code C ⊂ {0,1}^N tolerating (1/2 − δ)N adversarial deletions must satisfy |C| ≤ 2^{poly log N} and thus have rate asymptotically approaching 0. This is the first constant-fraction improvement over the trivial bound that codes tolerating N/2 adversarial deletions must have rate going to 0 asymptotically. Equivalently, we show that there exist absolute constants A and δ > 0 such that any set C ⊂ {0,1}^N of 2^{log^A N} binary strings must contain two strings c and c′ whose longest common subsequence has length at least (1/2 + δ)N. As an immediate corollary, we show that q-ary codes tolerating a fraction 1 − (1 + 2δ)/q of adversarial deletions must also have rate approaching 0. Our techniques include string regularity arguments and a structural lemma that classifies binary strings by their oscillation patterns. Leveraging these tools, we find in any large code two strings with similar oscillation patterns, and we exploit this similarity to find a long common subsequence.
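The equivalence between deletion tolerance and longest common subsequences used above can be illustrated with a small sketch. It relies on the standard fact that a code corrects t adversarial deletions exactly when no two distinct codewords share a common subsequence of length N − t. The function names `lcs_length` and `max_tolerable_deletions` are hypothetical and chosen only for this illustration; the brute-force pairwise check is not the paper's technique and is only practical for tiny codes.

```python
from itertools import combinations


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def max_tolerable_deletions(code: list[str]) -> int:
    """Largest t such that the code corrects t adversarial deletions.

    Assumes all codewords are distinct and have the same length N.
    The code corrects t deletions iff every pair has LCS < N - t,
    so the largest such t is N - (max pairwise LCS) - 1.
    """
    n = len(code[0])
    worst = max(lcs_length(c1, c2) for c1, c2 in combinations(code, 2))
    return n - worst - 1


# Two codewords can tolerate almost N deletions.
print(max_tolerable_deletions(["0000", "1111"]))
# Among any three binary strings, two share a majority bit, so their LCS
# is at least N/2 and the code cannot tolerate N/2 deletions.
print(max_tolerable_deletions(["0000", "1111", "0101"]))
```

The second call shows the trivial bound in miniature: adding any third codeword of length 4 forces a pair with a common subsequence of length at least 2. The result stated above says that for large enough codes some pair must share a common subsequence of length at least (1/2 + δ)N.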
