Representative local features mining for large-scale near-duplicates retrieval

Local features are widely used in computer vision research, for example in near-duplicate image and video retrieval. However, their storage and query costs become prohibitive on large-scale databases. In this paper, we propose a representative local feature mining method that generates a compact yet more effective feature subset. First, we perform unsupervised annotation of all similar images (or video frames) in the database. Second, we compute a comprehensive score for every local feature; the score function combines robustness and discrimination. Finally, we sort the local features in each image by score and remove the low-scoring ones. The selected local features are robust and discriminative, which allows them to deliver better retrieval quality than the full original feature set. Our method significantly reduces the number of local features and saves a large amount of storage and computational cost. Experimental results show that using 30% of the features yields better query performance than using the full feature set.
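The scoring and selection steps can be illustrated with a minimal sketch. The abstract does not give the form of the score function, so the weighted sum below, the weight `alpha`, the function name `select_representative`, and the assumption that per-feature robustness and discrimination values are already available in [0, 1] are all hypothetical choices made for illustration, not the method of the paper.

```python
import numpy as np

def select_representative(robustness, discrimination, keep_ratio=0.3, alpha=0.5):
    """Keep the top-scoring fraction of one image's local features.

    robustness, discrimination: per-feature values, assumed to lie in [0, 1].
    The weighted sum is an assumed stand-in for the paper's score function.
    Returns (sorted indices of kept features, scores of all features).
    """
    robustness = np.asarray(robustness, dtype=float)
    discrimination = np.asarray(discrimination, dtype=float)
    if robustness.shape != discrimination.shape:
        raise ValueError("inputs must have the same shape")

    # Assumed combination: convex mix of the two criteria.
    scores = alpha * robustness + (1.0 - alpha) * discrimination

    # Keep at least one feature, otherwise the requested fraction.
    n_keep = max(1, int(round(keep_ratio * len(scores))))

    # Sort by descending score; a stable sort keeps ties in original order.
    order = np.argsort(-scores, kind="stable")
    keep = np.sort(order[:n_keep])
    return keep, scores

# Toy example with ten features from a single image.
rob = [0.9, 0.1, 0.8, 0.2, 0.5, 0.4, 0.95, 0.3, 0.6, 0.05]
dis = [0.8, 0.2, 0.9, 0.1, 0.5, 0.3, 0.9, 0.2, 0.4, 0.1]
kept, scores = select_representative(rob, dis, keep_ratio=0.3)
print(kept)
```

With `keep_ratio=0.3`, only the three highest-scoring features of the ten are retained, mirroring the 30% setting reported in the experiments; the descriptors at the remaining indices would simply not be indexed.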
