Optimizing kd-trees for scalable visual descriptor indexing

In this paper, we attempt to scale up kd-tree indexing methods for large-scale vision applications, e.g., indexing a large number of SIFT features and other types of visual descriptors. To this end, we propose an effective approach that generates near-optimal binary space partitions while keeping the cost of accessing nodes in the query stage low. First, we relax the coordinate-axis-alignment constraint that conventional kd-trees impose on partition axis selection: for each node, we form a partition axis with large variance by combining a few coordinate axes in a binary manner. This yields better space partitioning and, thanks to cheap projection operations, requires almost the same time to visit internal nodes during the query stage. Then, we introduce a simple but very effective scheme that guarantees the partition axis of each internal node is either orthogonal or parallel to those of its ancestors, which leads to efficient computation of the distance between a query point and the cell associated with each node and thus to fast priority search. Compared with conventional kd-trees, our approach takes slightly more tree construction time but obtains much better nearest neighbor search performance. Experimental results on large-scale local patch indexing and image search with tiny images show that our approach outperforms state-of-the-art kd-tree-based indexing methods.
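
The idea of forming a partition axis from a few coordinate axes can be illustrated with a minimal sketch. It is not the construction procedure of the paper, whose details are not given in the abstract. It assumes that "in a binary manner" means coefficients of +1 or -1 on a small subset of coordinate axes, and it searches exhaustively over combinations of the highest-variance axes. The names `choose_binary_axis`, `projected_variance`, `num_candidates`, and `max_combined` are hypothetical.

```python
from itertools import combinations, product
from statistics import pvariance


def projected_variance(points, axes, signs):
    """Variance of the points along the unit direction sum_i s_i * e_i / sqrt(k)."""
    # The projection itself needs only additions and subtractions,
    # which is why visiting such a node stays cheap at query time.
    proj = [sum(s * p[a] for a, s in zip(axes, signs)) for p in points]
    # Dividing by k = len(axes) accounts for the 1/sqrt(k) normalization.
    return pvariance(proj) / len(axes)


def choose_binary_axis(points, num_candidates=4, max_combined=2):
    """Return (variance, axes, signs) of the best axis found (assumed search scheme)."""
    dim = len(points[0])
    # Rank coordinate axes by variance, as a conventional kd-tree would.
    var = [pvariance([p[d] for p in points]) for d in range(dim)]
    top = sorted(range(dim), key=lambda d: var[d], reverse=True)[:num_candidates]
    # Baseline: the single highest-variance coordinate axis.
    best = (var[top[0]], (top[0],), (1,))
    for k in range(2, max_combined + 1):
        for axes in combinations(top, k):
            # Fix the first sign to +1: an axis and its negation are equivalent.
            for rest in product((1, -1), repeat=k - 1):
                signs = (1,) + rest
                v = projected_variance(points, axes, signs)
                if v > best[0]:
                    best = (v, axes, signs)
    return best


if __name__ == "__main__":
    # Points spread along the diagonal of the first two coordinates.
    data = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 2.0, 0.0), (3.0, 3.0, 0.0)]
    print(choose_binary_axis(data))
```

For data spread along a diagonal, the combined axis captures more variance than any single coordinate axis, which is the motivation for relaxing the axis-alignment constraint.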
