A Reliable Order-Statistics-Based Approximate Nearest Neighbor Search Algorithm

We propose a new algorithm for fast approximate nearest neighbor search based on the properties of ordered vectors. Data vectors are classified according to the index and sign of their largest components, thereby partitioning the space into a number of cones centered at the origin. The query is classified in the same way, and the search starts from the selected cone and proceeds to neighboring ones. Overall, the proposed algorithm corresponds to locality-sensitive hashing in the space of directions, with hashing based on the order of components. Thanks to the statistical features that emerge through ordering, it deals very well with the challenging case of unstructured data and is a valuable building block for more complex techniques dealing with structured data. Experiments on both simulated and real-world data show that the proposed algorithm provides state-of-the-art performance.
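The classification and probing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it keys each cone on a single largest-magnitude component (the abstract speaks of several), and the probing order (cones ranked by the query's next-largest components) is an assumption made here for illustration. The names `cone_key`, `build_index`, `probe_order`, `search`, and `max_probes` are hypothetical.

```python
import math
from collections import defaultdict


def cone_key(v):
    """Return (index, sign) of the largest-magnitude component of v."""
    i = max(range(len(v)), key=lambda j: abs(v[j]))
    return (i, 1 if v[i] >= 0 else -1)


def build_index(data):
    """Group the ids of the data vectors by the cone they fall into."""
    index = defaultdict(list)
    for n, v in enumerate(data):
        index[cone_key(v)].append(n)
    return index


def probe_order(q):
    """Cones to visit for query q, ranked by decreasing |q_j|.

    Assumption: a cone counts as "neighboring" when it is keyed on one of
    the query's next-largest components, with the sign of that component.
    """
    ranked = sorted(range(len(q)), key=lambda j: -abs(q[j]))
    return [(j, 1 if q[j] >= 0 else -1) for j in ranked]


def search(index, data, q, max_probes=2):
    """Scan the query's own cone first, then up to max_probes - 1 neighbors."""
    best_id, best_dist = None, math.inf
    for key in probe_order(q)[:max_probes]:
        for n in index.get(key, ()):
            d = math.dist(data[n], q)  # Euclidean distance
            if d < best_dist:
                best_id, best_dist = n, d
    return best_id, best_dist


# Toy data: four vectors in three dimensions.
data = [
    [0.9, 0.1, 0.0],
    [-0.8, 0.2, 0.1],
    [0.1, 0.95, 0.0],
    [0.0, -0.1, -1.0],
]
index = build_index(data)
query = [0.7, 0.6, 0.0]
nearest_id, nearest_dist = search(index, data, query, max_probes=2)
```

With `max_probes=1` only the query's own cone is scanned; raising it trades speed for recall, since more cones, and therefore more candidates, are visited.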
