A variant of k-nearest neighbors search with cyclically permuted query points for rotation-invariant image processing

The well-known k-nearest neighbors problem (kNN) involves building a data structure that reports the k closest training points to each of a given set of query points, with all points lying in a given metric space S. The problem discussed here is an important operation in rotation-invariant image processing. It is a nontrivial variant of kNN: given a set of training points X and a set of query points Y, find, for each query point y ∈ Y, the k nearest training points to y, where distance is measured by a pseudometric on S defined over the cyclic permutations of y and the elements of X. The number of cyclic permutations of each query point makes serial brute-force search too costly for instances of practical size. We present a transformation that allows any instance of the variant to be solved as a kNN problem. Although this permits the use of any kNN algorithm, the transformation itself is too time-consuming for instances of realistic dimension. For this reason, we present a condensation algorithm that efficiently eliminates unfavorable training points (those that cannot be among the k closest neighbors of y) and an effective parallel programming approach based on the discrete Fourier transform. We report a significant speedup over brute-force search on practical datasets. We also provide the mathematical basis of a conjecture that, if true, would allow the speedup to be improved significantly. The application of the approach to classification by support vector machines and to k-means clustering is also discussed.
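To make the cyclic-permutation distance concrete, the following is a minimal sketch, not the authors' algorithm. It assumes that points are real vectors of length d and that the underlying distance is Euclidean; the abstract fixes neither. Under those assumptions, the minimum over all cyclic shifts of y satisfies min_s ||x - shift_s(y)||² = ||x||² + ||y||² - 2 max_s ⟨x, shift_s(y)⟩, and all d inner products can be obtained at once as a circular cross-correlation computed with the discrete Fourier transform. The function names `cyclic_dist_brute` and `cyclic_knn` are hypothetical.

```python
import numpy as np

def cyclic_dist_brute(x, y):
    """Reference: minimum Euclidean distance between x and every cyclic shift of y."""
    return min(np.linalg.norm(x - np.roll(y, s)) for s in range(len(y)))

def cyclic_knn(X, y, k):
    """Indices and distances of the k training rows of X closest to y,
    minimizing over cyclic shifts of y (assumed Euclidean base distance)."""
    n, d = X.shape
    # Circular cross-correlation of every row of X with y:
    # corr[i, s] = <X[i], np.roll(y, s)> for s = 0..d-1.
    spectrum = np.fft.rfft(X, axis=1) * np.conj(np.fft.rfft(y))
    corr = np.fft.irfft(spectrum, n=d, axis=1)
    # Squared distance at the best shift; clip tiny negatives from round-off.
    sq = np.sum(X * X, axis=1) + y @ y - 2.0 * corr.max(axis=1)
    dists = np.sqrt(np.maximum(sq, 0.0))
    order = np.argsort(dists)[:k]
    return order, dists[order]

# Usage on synthetic data (values are arbitrary illustrations).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 16))
y = rng.normal(size=16)
neighbors, distances = cyclic_knn(X, y, k=3)
```

For n training points, the brute-force loop costs O(n d²) per query, whereas the transform-based version costs O(n d log d). The latter is also a batch of independent per-row transforms, so it parallelizes naturally.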
