Low distortion embeddings for edit distance

We show that {0, 1}<sup><i>d</i></sup> endowed with edit distance embeds into ℓ<sub>1</sub> with distortion 2<sup><i>O</i>(√(log <i>d</i> log log <i>d</i>))</sup>. We further show efficient implementations of the embedding that yield solutions to various computational problems involving edit distance. These include sketching, communication complexity, and nearest neighbor search. For all these problems, we improve upon previous bounds.
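To make the metric concrete, the following is a minimal sketch of edit distance on binary strings, assuming the standard unit-cost definition (insertions, deletions, and substitutions of single characters). It is not the embedding from the paper. The function names `edit_distance` and `hamming_distance` are illustrative choices. The example pair shows why the identity map into ℓ<sub>1</sub> (which yields Hamming distance) is a poor embedding: a one-position shift costs two edits but changes every coordinate.

```python
def edit_distance(x: str, y: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute) via dynamic programming."""
    # prev[j] holds the distance between the current prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]  # distance from x[:i] to the empty string
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def hamming_distance(x: str, y: str) -> int:
    """l_1 distance between equal-length 0/1 strings viewed as vectors."""
    if len(x) != len(y):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(x, y))


if __name__ == "__main__":
    x, y = "01010101", "10101010"  # y is x shifted by one position
    print(edit_distance(x, y), hamming_distance(x, y))
```

For alternating strings of length <i>d</i>, the gap between the two distances grows linearly in <i>d</i>, which is the kind of behavior a low-distortion embedding has to avoid.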
