TOP-SIFT: the selected SIFT descriptor based on dictionary learning

The large number of SIFT descriptors per image and the high dimensionality of each descriptor create speed and scalability problems for large-scale image databases. In this paper, we present a descriptor selection algorithm based on dictionary learning that removes redundant features and retains only a small set of features, which we refer to as TOP-SIFTs. During our experiments, we observed a close relationship between descriptor selection and dictionary learning in sparse representation, and we therefore recast descriptor selection as a dictionary learning problem. We designed a new dictionary learning method suited to this problem and employed simulated annealing to search for an optimal solution. During learning, we imposed a sparsity constraint and took the spatial distribution of the SIFT keypoints into account, and finally selected a small, representative feature set with good spatial distribution. Unlike earlier methods, our method neither relies on the database nor discards important information, and experiments show that it substantially reduces memory usage and improves time efficiency while maintaining accuracy.
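
The selection idea described above can be illustrated with a rough sketch. The abstract does not give the actual objective function, so everything here is an assumption made for illustration: the function name `select_top_descriptors`, the 1-sparse reconstruction cost (each descriptor is represented by its nearest selected descriptor), the spatial-spread reward weighted by `lam`, and the geometric cooling schedule. It is not the authors' formulation, only a minimal instance of choosing a subset of descriptors as dictionary atoms with simulated annealing.

```python
import numpy as np

def select_top_descriptors(desc, xy, k, lam=0.1, steps=2000,
                           t0=1.0, cooling=0.995, seed=0):
    """Pick k of n descriptors as dictionary atoms via simulated annealing.

    desc: (n, d) descriptor matrix; xy: (n, 2) keypoint positions.
    The cost below is a hypothetical stand-in, not the paper's objective.
    """
    n = desc.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    rng = np.random.default_rng(seed)

    # Pairwise squared distances between descriptors.
    sq = (desc ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * desc @ desc.T, 0.0)
    # Pairwise Euclidean distances between keypoint positions.
    diff = xy[:, None, :] - xy[None, :, :]
    sd = np.sqrt((diff ** 2).sum(axis=-1))

    def cost(idx):
        # Assumed 1-sparse reconstruction error: distance to nearest atom.
        recon = d2[:, idx].mean(axis=0).size and d2[:, idx].min(axis=1).mean()
        # Assumed spatial term: reward a large mean distance between atoms.
        spread = sd[np.ix_(idx, idx)].sum() / (k * (k - 1)) if k > 1 else 0.0
        return recon - lam * spread

    current = rng.choice(n, size=k, replace=False)
    cur_cost = cost(current)
    best, best_cost = current.copy(), cur_cost
    t = t0
    for _ in range(steps):
        # Neighbour move: swap one selected descriptor for an unselected one.
        cand = current.copy()
        outside = np.setdiff1d(np.arange(n), current)
        cand[rng.integers(k)] = rng.choice(outside)
        c = cost(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if c < cur_cost or rng.random() < np.exp((cur_cost - c) / t):
            current, cur_cost = cand, c
            if c < best_cost:
                best, best_cost = cand.copy(), c
        t *= cooling  # geometric cooling (an assumption)
    return np.sort(best), float(best_cost)
```

Calling `select_top_descriptors(desc, xy, k=50)` on an `(n, 128)` descriptor array and the matching `(n, 2)` keypoint coordinates returns the indices of the retained descriptors and the final cost. Precomputing the two distance matrices keeps each annealing step cheap, at the price of O(n^2) memory, which is acceptable per image but would need rethinking for very large n.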
