Recovery from Non-Decomposable Distance Oracles

A line of work has studied the problem of recovering an input from distance queries. In this setting, there is an unknown sequence $s \in \{0,1\}^{\le n}$, and one chooses a set of queries $y \in \{0,1\}^{O(n)}$ and receives $d(s, y)$ for a distance function $d$. The goal is to recover $s$ using as few queries as possible. This problem is well studied for decomposable distances, i.e., distances of the form $d(s, y) = \sum_{i=1}^{n} f(s_i, y_i)$ for some function $f$, which include the important cases of Hamming distance, $\ell_p$-norms, and $M$-estimators. To the best of our knowledge, however, it has not been studied for non-decomposable distances, which include important special cases such as edit distance, dynamic time warping (DTW), Fréchet distance, and earth mover's distance. We initiate this study and develop a general framework for such distances. Interestingly, for some distances such as DTW or Fréchet, exact recovery of the sequence $s$ is provably impossible, so we show that recovery becomes possible once the characters in $y$ may be drawn from a slightly larger alphabet. In a number of cases we obtain optimal or near-optimal query complexity. We also study the role of adaptivity for a number of different distance functions. One motivation for understanding non-adaptivity is that the query sequence can be fixed in advance, so that the distances from the input to the queries provide a non-linear embedding of the input, which can be used in downstream applications involving, e.g., neural networks for natural language processing.
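To make the query model concrete, the following is a minimal sketch of non-adaptive recovery in the simplest decomposable case, Hamming distance. It is an illustration only, not a method taken from this work. It assumes, as a simplification, that the length n is known and that s has exactly n characters, and the names `hamming`, `make_oracle`, and `recover` are hypothetical. The scheme is the naive one that uses n + 1 queries and is not meant to be query-optimal.

```python
from typing import Callable, List

Bits = List[int]


def hamming(s: Bits, y: Bits) -> int:
    """Decomposable distance: sum over positions of f(s_i, y_i) = [s_i != y_i]."""
    return sum(a != b for a, b in zip(s, y))


def make_oracle(secret: Bits) -> Callable[[Bits], int]:
    """Hypothetical distance oracle that hides `secret` and answers d(secret, y)."""
    def oracle(y: Bits) -> int:
        if len(y) != len(secret):
            raise ValueError("this sketch assumes queries of the same length as s")
        return hamming(secret, y)
    return oracle


def recover(oracle: Callable[[Bits], int], n: int) -> Bits:
    """Recover s from n + 1 queries that are all fixed before any answer is seen."""
    # Query 0 is the all-zeros string; query i + 1 is the i-th unit vector.
    queries = [[0] * n] + [[1 if j == i else 0 for j in range(n)] for i in range(n)]
    answers = [oracle(y) for y in queries]  # non-adaptive: no answer influences a query

    base = answers[0]  # d(s, 0...0) equals the number of ones in s
    # Setting position i to 1 lowers the distance by one if s_i = 1
    # and raises it by one if s_i = 0.
    return [1 if answers[i + 1] < base else 0 for i in range(n)]


if __name__ == "__main__":
    secret = [1, 0, 1, 1, 0, 0, 1]
    print(recover(make_oracle(secret), len(secret)) == secret)
```

Because every query is chosen before any answer arrives, the vector of answers is itself a fixed embedding of s, which is the viewpoint behind the non-adaptive motivation above. The position-by-position argument used in `recover` relies on the distance being a sum of per-coordinate terms, which is exactly what non-decomposable distances such as edit distance or DTW lack.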
