Distance Estimation Between Unknown Matrices Using Sublinear Projections on Hamming Cube

Using geometric techniques such as projection and dimensionality reduction, we show that there exists a randomized sublinear-time algorithm that estimates the Hamming distance between two matrices. Consider two matrices A and B of size $n \times n$ with real entries; the dimensions are known to the algorithm, but the entries are not. Each matrix can be accessed only through an oracle that returns the projection (inner product) of a row or a column of the matrix onto a vector in $\{0,1\}^n$. We call this an Inner Product oracle (IP for short). Let $D_M(A,B)$ denote the Hamming distance between A and B, that is, the number of corresponding entries in which they differ. We show that our algorithm returns a $(1 \pm \epsilon)$-approximation to $D_M(A,B)$ with high probability using $O\left(\frac{n}{\sqrt{D_M(A,B)}} \cdot \mathrm{poly}\left(\log n, \frac{1}{\epsilon}\right)\right)$ oracle queries. We also show a matching lower bound on the number of IP queries needed. Although our main result concerns estimating $D_M(A,B)$ using IP, we also compare our results with other query models.
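To make the query model concrete, the following is a minimal Python sketch of a simulated IP oracle. It is not the algorithm from the paper: it only illustrates what a single query returns, how a brute-force baseline would recover the distance exactly with n² queries per matrix (using unit vectors), and how random binary query vectors can detect that two rows differ. The names `IPOracle`, `row_query`, `row_differs`, and `exact_distance_via_ip` are hypothetical, and integer entries are assumed so that equality tests are exact.

```python
import random


class IPOracle:
    """Hypothetical simulation of an Inner Product (IP) oracle.

    The matrix is hidden from the caller; only row_query is available.
    """

    def __init__(self, matrix):
        self._m = [list(row) for row in matrix]
        self.n = len(matrix)
        self.queries = 0  # number of oracle calls made so far

    def row_query(self, i, v):
        """Return the inner product of row i with a binary vector v."""
        if len(v) != self.n or any(x not in (0, 1) for x in v):
            raise ValueError("v must be a 0/1 vector of length n")
        self.queries += 1
        return sum(a * x for a, x in zip(self._m[i], v))


def hamming_distance(a, b):
    """Ground truth: number of positions where the matrices differ."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def exact_distance_via_ip(oracle_a, oracle_b):
    """Brute-force baseline: a unit vector e_j reads out entry (i, j).

    Uses n*n queries per matrix, so it is not sublinear.
    """
    n = oracle_a.n
    dist = 0
    for i in range(n):
        for j in range(n):
            e = [0] * n
            e[j] = 1
            if oracle_a.row_query(i, e) != oracle_b.row_query(i, e):
                dist += 1
    return dist


def row_differs(oracle_a, oracle_b, i, trials=20, rng=random):
    """Randomized test of whether row i of A differs from row i of B.

    If the rows differ, a uniformly random 0/1 vector yields different
    inner products with probability at least 1/2, so the test misses a
    difference with probability at most 2**(-trials). If the rows are
    equal, it always returns False.
    """
    n = oracle_a.n
    for _ in range(trials):
        v = [rng.randint(0, 1) for _ in range(n)]
        if oracle_a.row_query(i, v) != oracle_b.row_query(i, v):
            return True
    return False
```

The contrast between the two helper functions is the point of the sketch: reading entries one at a time costs n² queries, whereas a single query with a random binary vector already carries information about an entire row, which is what makes query counts below n² plausible in this model.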
