Nearest Neighbors Using Compact Sparse Codes

In this paper, we propose a novel scheme for approximate nearest neighbor (ANN) retrieval based on dictionary learning and sparse coding. Our key innovation is to build compact codes, dubbed SpANN codes, from the active set of sparse-coded data. These codes are then used to index an inverted file table for fast retrieval. The active sets are often sensitive to small differences among data points, so that only near-duplicates are retrieved. We show that this sensitivity is related to the coherence of the dictionary, with smaller coherence resulting in better retrieval. To this end, we propose a novel dictionary learning formulation with incoherence constraints and an efficient method to solve it. Experiments on two state-of-the-art computer vision datasets with 1M data points show an order of magnitude improvement in retrieval accuracy, without sacrificing memory or query time, compared to state-of-the-art methods.
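The indexing idea described above can be illustrated with a small sketch. It is not the SpANN encoding itself: it assumes the code for a point is simply the sorted tuple of its active atom indices, it uses a basic greedy orthogonal matching pursuit as the sparse coder, and every function name (`mutual_coherence`, `omp_support`, `build_inverted_file`, `query`) is hypothetical. The coherence helper computes the standard mutual coherence, i.e. the largest absolute inner product between distinct unit-norm atoms.

```python
import numpy as np
from collections import defaultdict

def mutual_coherence(D):
    """Largest absolute inner product between distinct unit-normalized atoms (columns of D)."""
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)
    G = np.abs(Dn.T @ Dn)
    np.fill_diagonal(G, 0.0)  # ignore each atom's inner product with itself
    return float(G.max())

def omp_support(D, x, k):
    """Greedy orthogonal matching pursuit (assumed coder); returns the sorted active set of size k."""
    residual = np.asarray(x, dtype=float).copy()
    support = []
    for _ in range(k):
        scores = np.abs(D.T @ residual)
        if support:
            scores[support] = -1.0  # never select the same atom twice
        support.append(int(np.argmax(scores)))
        # Re-fit the coefficients on the selected atoms and update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    return tuple(sorted(support))

def build_inverted_file(D, X, k):
    """Map each active set (used here as a stand-in for a compact code) to the ids of the points sharing it."""
    table = defaultdict(list)
    for i, x in enumerate(X):
        table[omp_support(D, x, k)].append(i)
    return table

def query(table, D, q, k):
    """Return the ids of database points whose active set matches that of the query."""
    return table.get(omp_support(D, q, k), [])

# Toy usage with an orthonormal (zero-coherence) dictionary.
D = np.eye(4)
X = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.9, 0.6, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.5]])
table = build_inverted_file(D, X, k=2)
candidates = query(table, D, np.array([1.0, 0.4, 0.0, 0.0]), k=2)
```

Because a lookup succeeds only when the query and a database point share exactly the same active set, a coder whose support changes under small perturbations returns few candidates, which is the sensitivity the abstract attributes to dictionary coherence.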
