On the Cost of Fixed Partial Match Queries in K-d Trees

Partial match queries constitute the most basic type of associative queries in multidimensional data structures such as $K$-d trees or quadtrees. Given a query $\mathbf{q}=(q_0,\ldots,q_{K-1})$ where $s$ of the coordinates are specified and $K-s$ are left unspecified ($q_i=*$), a partial match search returns the subset of data points $\mathbf{x}=(x_0,\ldots,x_{K-1})$ in the data structure that match the given query, that is, the data points such that $x_i=q_i$ whenever $q_i\neq *$. There exists a wealth of results about the cost of partial match searches in many different multidimensional data structures, but most of these results deal with random queries. Only recently a few papers have begun to investigate the cost of partial match queries with a fixed query $\mathbf{q}$. This paper represents a new contribution in this direction, giving a detailed asymptotic estimate of the expected cost $P_{n,\mathbf{q}}$ for a given fixed query $\mathbf{q}$. From previous results on the cost of partial matches with a fixed query and the ones presented here, a deeper understanding is emerging, uncovering the following functional shape for $P_{n,\mathbf{q}}$:

$$P_{n,\mathbf{q}} = \nu \cdot \left( \prod_{i:\,q_i\text{ is specified}} q_i(1-q_i)\right)^{\alpha/2} \cdot n^{\alpha} + \text{l.o.t.}$$

(l.o.t.: lower order terms, throughout this work) in many multidimensional data structures, which differ only in the exponent $\alpha$ and the constant $\nu$, both dependent on $s$ and $K$, and, for some data structures, on the whole pattern of specified and unspecified coordinates in $\mathbf{q}$ as well. Although it is tempting to conjecture that this functional shape is “universal”, we have shown experimentally that it seems not to be true for a variant of $K$-d trees called squarish $K$-d trees.
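
To make the definitions above concrete, here is a minimal Python sketch of a partial match search in a standard $K$-d tree, where the discriminating coordinate cycles with the depth. It is an illustration under stated assumptions, not the construction analyzed in the paper: the names `Node`, `insert`, `partial_match` and `leading_term` are hypothetical, `None` stands for an unspecified coordinate ($q_i=*$), and the cost is taken to be the number of visited nodes. The helper `leading_term` only evaluates the main term of the functional shape for caller-supplied $\alpha$ and $\nu$; it does not compute those constants.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], point: Point, depth: int = 0) -> Node:
    """Insert into a standard K-d tree; the discriminant is depth mod K."""
    if root is None:
        return Node(point)
    j = depth % len(point)
    if point[j] < root.point[j]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def partial_match(
    root: Optional[Node], q: Sequence[Optional[float]], depth: int = 0
) -> Tuple[List[Point], int]:
    """Return (matching points, visited nodes); None marks an unspecified coordinate."""
    if root is None:
        return [], 0
    j = depth % len(q)
    is_match = all(qi is None or qi == xi for qi, xi in zip(q, root.point))
    matches = [root.point] if is_match else []
    visited = 1
    if q[j] is None:
        # Unspecified discriminant: both subtrees may contain matches.
        children = (root.left, root.right)
    elif q[j] < root.point[j]:
        children = (root.left,)
    else:
        # Equal keys were inserted to the right, so search right as well.
        children = (root.right,)
    for child in children:
        sub_matches, sub_visited = partial_match(child, q, depth + 1)
        matches += sub_matches
        visited += sub_visited
    return matches, visited


def leading_term(
    q: Sequence[Optional[float]], n: int, alpha: float, nu: float
) -> float:
    """Main term nu * (prod over specified q_i of q_i(1-q_i))^(alpha/2) * n^alpha."""
    prod = 1.0
    for qi in q:
        if qi is not None:
            prod *= qi * (1.0 - qi)
    return nu * prod ** (alpha / 2) * n ** alpha


if __name__ == "__main__":
    random.seed(0)
    root = None
    for _ in range(1000):
        root = insert(root, (random.random(), random.random()))
    # Fixed query with the first coordinate specified and the second left free.
    _, cost = partial_match(root, (0.5, None))
    print("visited nodes:", cost)
```

Averaging `cost` over many random trees of the same size, for a fixed query, gives an empirical estimate of $P_{n,\mathbf{q}}$ that can be compared against `leading_term` once values for $\alpha$ and $\nu$ are known for the data structure at hand.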
