Efficient Sketches for Earth-Mover Distance, with Applications

We provide the first sub-linear sketching algorithm for estimating the planar Earth-Mover Distance with a constant approximation. For sets living in the two-dimensional grid $[\Delta]^2$, we achieve space $\Delta^{\eps}$ for approximation $O(1/\eps)$, for any desired $0
