Edit Errors with Block Transpositions: Deterministic Document Exchange Protocols and Almost Optimal Binary Codes

Document exchange and error correcting codes are two fundamental problems in communication. In both problems, an upper bound is placed on the number of errors, either between the two parties' strings or introduced by the channel, and a major goal is to minimize the size of the sketch or of the redundant information. In this paper we focus on deterministic document exchange protocols and binary error correcting codes. In a recent work \cite{CJLW18}, the authors constructed explicit deterministic document exchange protocols and binary error correcting codes for edit errors with almost optimal parameters. Unfortunately, the constructions in \cite{CJLW18} do not work for other common errors such as block transpositions. In this paper, we generalize the constructions in \cite{CJLW18} to handle a much larger class of errors. Specifically, we consider document exchange and error correcting codes where the total number of block insertions, block deletions, and block transpositions is at most $k \leq \alpha n/\log n$ for some constant $0<\alpha<1$. In addition, the total number of bits inserted and deleted by the first two kinds of operations is at most $t \leq \beta n$ for some constant $0<\beta<1$, where $n$ is the length of Alice's string or message. We construct explicit, deterministic document exchange protocols with sketch size $O(k \log^2 n+t)$ and explicit binary error correcting codes with $O(k \log n \log \log n+t)$ redundant bits. For comparison, the information-theoretic optimum for both problems is $\Theta(k \log n+t)$. As far as we know, no explicit deterministic document exchange protocols were previously known in this setting, and the best known binary code needs $\Omega(n)$ redundant bits to correct even \emph{one} block transposition.
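To make the error model and the stated bounds concrete, the following is a minimal Python sketch, not the paper's construction. It assumes that a block transposition means cutting a contiguous substring and reinserting it elsewhere in the string. The helper names (`block_insert`, `block_delete`, `block_transpose`, `bound_expressions`) are hypothetical, and the bound expressions are evaluated with all hidden constants set to 1 and base-2 logarithms, which is an illustrative assumption only.

```python
import math


def block_insert(s: str, pos: int, block: str) -> str:
    """Insert a contiguous block at index pos (counts toward k and adds len(block) to t)."""
    return s[:pos] + block + s[pos:]


def block_delete(s: str, pos: int, length: int) -> str:
    """Delete a contiguous block of the given length (counts toward k and adds length to t)."""
    return s[:pos] + s[pos + length:]


def block_transpose(s: str, start: int, length: int, dest: int) -> str:
    """Cut s[start:start+length] and reinsert it at index dest of the remaining string.

    Assumed reading of "block transposition"; it counts toward k but not toward t.
    """
    block = s[start:start + length]
    rest = s[:start] + s[start + length:]
    return rest[:dest] + block + rest[dest:]


def bound_expressions(n: int, k: int, t: int) -> dict:
    """Evaluate the expressions inside the asymptotic bounds, with constants set to 1."""
    log_n = math.log2(n)
    return {
        "sketch": k * log_n ** 2 + t,                # O(k log^2 n + t)
        "code": k * log_n * math.log2(log_n) + t,    # O(k log n log log n + t)
        "optimum": k * log_n + t,                    # Theta(k log n + t)
    }


if __name__ == "__main__":
    print(block_transpose("abcdefgh", 2, 3, 0))
    print(bound_expressions(2 ** 16, 4, 100))
```

Because the constants are ignored, the printed values indicate only how the three expressions scale relative to one another, not actual sketch or redundancy sizes.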

[1] Torsten Suel, et al. Improved single-round protocols for remote file synchronization, 2005, Proceedings IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies.

[2] Uzi Vishkin, et al. Communication complexity of document exchange, 1999, SODA '00.

[3] Noga Alon, et al. Simple Construction of Almost k-wise Independent Random Variables, 1992, Random Struct. Algorithms.

[4] Alon Orlitsky, et al. Interactive communication: balanced distributions, correlated files, and average-case complexity, 1991, Proceedings 32nd Annual Symposium of Foundations of Computer Science.

[5] Hossein Jowhari, et al. Efficient Communication Protocols for Deciding Edit Distance, 2012, ESA.

[6] Vladimir I. Levenshtein, et al. Binary codes capable of correcting deletions, insertions, and reversals, 1965.

[7] Djamal Belazzougui, et al. Efficient Deterministic Single Round Document Exchange for Edit Distance, 2015, ArXiv.

[8] M. Luby, et al. Asymptotically Good Codes Correcting Insertions, Deletions, and Transpositions, 1999.

[9] Qin Zhang, et al. Edit Distance: Sketching, Streaming, and Document Exchange, 2016, 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).

[10] Zhengzhong Jin, et al. Deterministic Document Exchange Protocols, and Almost Optimal Binary Codes for Edit Errors, 2018, 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS).

[11] Michal Koucký, et al. Low Distortion Embedding from Edit to Hamming Distance using Coupling, 2015, Electron. Colloquium Comput. Complex.

[12] Ian F. Blake, et al. Algebraic-Geometry Codes, 1998, IEEE Trans. Inf. Theory.

[13] Graham Cormode, et al. The string edit distance matching problem with moves, 2007, TALG.

[14] V. Guruswami, et al. Efficient low-redundancy codes for correcting multiple deletions, 2016, SODA 2016.

[15] Dana Shapira, et al. Edit distance with move operations, 2002, J. Discrete Algorithms.