Approximate all nearest neighbor search for high dimensional entropy estimation for image registration

Information-theoretic criteria such as mutual information are often used as similarity measures for inter-modality image registration. For better performance, it is useful to consider vector-valued pixel features. However, this leads to the task of estimating entropy in medium- to high-dimensional spaces, for which the standard histogram entropy estimator is not usable. We have therefore previously proposed using a nearest neighbor-based Kozachenko-Leonenko (KL) entropy estimator. Here we address the issue of determining a suitable all nearest neighbor (NN) search algorithm for this relatively specific task. We evaluate several well-known state-of-the-art algorithms based on k-d trees (FLANN), balanced box decomposition (BBD) trees (ANN), and locality sensitive hashing (LSH), using publicly available implementations. In addition, we present our own method, which is based on k-d trees with several enhancements and is tailored for this particular application. We conclude that all tree-based methods perform acceptably well, with our method being the fastest and most suitable for the all-NN search task needed by the KL estimator on image data, while the ANN and especially FLANN methods are most often the fastest on other types of data. On the other hand, LSH is found to be the least suitable, with brute force search being the slowest.
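To make the role of the all-NN search concrete, the following is a minimal sketch of a KL-type entropy estimate computed with the brute force search used above as the baseline. It is not the implementation evaluated here. The function names are hypothetical, and the formula is the commonly stated first-nearest-neighbor form, H ≈ (d/N) Σ log ρ_i + log V_d + ψ(N) − ψ(1), which is an assumption about the exact variant; ρ_i is the distance from sample i to its nearest neighbor and V_d is the volume of the d-dimensional unit ball.

```python
import math
import numpy as np


def all_nearest_neighbor_distances(points):
    """Brute-force all-NN search: for every point, the Euclidean
    distance to its nearest *other* point. Costs O(N^2) time and memory."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return dist.min(axis=1)


def kl_entropy(points):
    """First-nearest-neighbor KL-type entropy estimate in nats
    (assumed variant, see the text above)."""
    n, d = points.shape
    rho = all_nearest_neighbor_distances(points)
    # log volume of the d-dimensional unit ball: pi^(d/2) / Gamma(d/2 + 1)
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    # psi(n) - psi(1) equals the harmonic number H_{n-1} for integer n
    harmonic = sum(1.0 / j for j in range(1, n))
    return d * np.mean(np.log(rho)) + log_unit_ball + harmonic


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.standard_normal((1000, 3))
    exact = 1.5 * math.log(2 * math.pi * math.e)  # entropy of N(0, I_3)
    print(kl_entropy(samples), exact)
```

The only step whose cost grows quadratically with the number of samples is `all_nearest_neighbor_distances`; this is the part that a k-d tree, BBD tree, or LSH search would replace. Note that duplicate samples give ρ_i = 0 and hence an infinite logarithm, so quantized image data needs separate handling that this sketch omits.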
