Paper Information - An Efficient Sum Query Algorithm for Distance-Based Locally Dominating Functions

In this paper, we consider the following sum query problem: Given a point set $$P$$ in $${\mathbb{R}}^d$$ and a distance-based function $$f(p,q)$$ (i.e., a function of the distance between $$p$$ and $$q$$) satisfying some general properties, the goal is to develop a data structure and a query algorithm for efficiently computing a $$(1+\epsilon)$$-approximate solution to the sum $$\sum_{p \in P} f(p,q)$$ for any query point $$q \in {\mathbb{R}}^d$$ and any small constant $$\epsilon > 0$$. Existing techniques for this problem are mainly based on core-sets, which often have difficulty handling functions with the local domination property. Based on several new insights into this problem, we develop a novel technique to overcome these difficulties. Our algorithm answers queries with high success probability in time no more than $${\tilde{O}}_{\epsilon,d}(n^{0.5+c})$$, and the underlying data structure can be constructed in $${\tilde{O}}_{\epsilon,d}(n^{1+c})$$ time for any $$c > 0$$, where the hidden constant has only polynomial dependence on $$1/\epsilon$$ and $$d$$. Our technique is simple and can be easily implemented for practical purposes.
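
To make the problem statement concrete, the following is a minimal Python sketch of the exact brute-force baseline and of a naive uniform-sampling estimator. It is not the paper's algorithm. The function names, the example function `g(r) = 1 / (r^2 + 1e-6)`, the point set, and the two-sided reading of "$$(1+\epsilon)$$-approximate" used in `is_within_factor` are all assumptions made for illustration.

```python
import math
import random


def exact_sum(points, q, g):
    """Brute-force baseline: evaluate g(dist(p, q)) for every p (O(n d) time)."""
    return sum(g(math.dist(p, q)) for p in points)


def uniform_sample_estimate(points, q, g, m, rng):
    """Naive estimator: average g over m uniform samples, scaled by n.

    Illustrative only; this is NOT the technique proposed in the paper.
    """
    sample = [rng.choice(points) for _ in range(m)]
    return len(points) * sum(g(math.dist(p, q)) for p in sample) / m


def is_within_factor(estimate, exact, eps):
    """Assumed reading of (1 + eps)-approximation: a two-sided ratio bound."""
    return exact / (1.0 + eps) <= estimate <= exact * (1.0 + eps)


# Hypothetical distance-based function in which nearby points dominate the sum.
def g(r):
    return 1.0 / (r * r + 1e-6)


# One point very close to the query and 999 points far away from it.
q = (0.0, 0.0)
near_point = (0.001, 0.0)
far_points = [(100.0 + i, 0.0) for i in range(999)]
points = [near_point] + far_points

exact = exact_sum(points, q, g)
near_share = g(math.dist(near_point, q)) / exact  # fraction of the sum from one point

# A small uniform sample usually misses the single dominating point.
estimate = uniform_sample_estimate(points, q, g, m=10, rng=random.Random(0))
print(exact, near_share, estimate)
```

In this toy instance a single point contributes almost the entire sum, so any estimator that fails to look near the query point is unlikely to satisfy the approximation check. This is one informal way to see why local domination is difficult for methods built on a fixed summary of the point set.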

Jinhui Xu | Ziyun Huang
