Uniting Keypoints: Local Visual Information Fusion for Large-Scale Image Search

In this paper, we propose a novel approach to reduce the very large number of local features that must be stored and searched in a large-scale image database. First, the local features of each image are organized into dozens of groups by running the standard k-means clustering algorithm on their spatial positions. Second, a compact descriptor is generated to describe the visual information of each group of local features. Because the thousands of local features in an image are reorganized into only dozens of groups, and each group is described by a single descriptor, the total number of descriptors in a large-scale database is greatly reduced, which significantly lowers the complexity of the search procedure. Furthermore, the group descriptors are encoded in binary format for storage and computational efficiency. Experiments on two benchmark datasets, UKBench and Holidays, with the Flickr1M distractor database demonstrate the effectiveness of the proposed approach.
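
The pipeline described above can be illustrated with a minimal sketch. The abstract does not specify how the group descriptor or the binary code is computed, so this sketch assumes mean pooling of the member descriptors and per-descriptor median thresholding. Both are stand-ins, not the method of the paper, and the function names, the toy data, and the choice of 32 groups are hypothetical.

```python
import numpy as np


def kmeans_positions(xy, k, iters=20, seed=0):
    """Plain Lloyd's k-means on (n, 2) keypoint coordinates; returns a label per keypoint."""
    rng = np.random.default_rng(seed)
    centers = xy[rng.choice(len(xy), size=k, replace=False)].astype(float)
    labels = np.zeros(len(xy), dtype=int)
    for _ in range(iters):
        # Assign each keypoint to its nearest center.
        dists = np.linalg.norm(xy[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members; empty clusters stay in place.
        for j in range(k):
            members = xy[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels


def group_descriptors(desc, labels, k):
    """ASSUMPTION: one descriptor per non-empty group, taken as the mean of its members."""
    groups = [desc[labels == j].mean(axis=0) for j in range(k) if np.any(labels == j)]
    return np.vstack(groups)


def binarize(group_desc):
    """ASSUMPTION: one bit per dimension, set when the value exceeds the row median."""
    medians = np.median(group_desc, axis=1, keepdims=True)
    return np.packbits(group_desc > medians, axis=1)


def hamming(a, b):
    """Hamming distance between two packed binary codes."""
    return int(np.unpackbits(a ^ b).sum())


# Toy image: 2000 keypoints with random positions and 128-D descriptors.
rng = np.random.default_rng(42)
xy = rng.uniform(0, 640, size=(2000, 2))
desc = rng.random((2000, 128)).astype(np.float32)

k = 32  # hypothetical number of groups per image
labels = kmeans_positions(xy, k)
codes = binarize(group_descriptors(desc, labels, k))
print(codes.shape)  # at most k rows; 128 bits packed into 16 bytes per group
```

In this toy setting, 2000 local descriptors per image shrink to at most 32 packed codes of 16 bytes each, and two groups can be compared with `hamming(codes[i], codes[j])` rather than a floating-point distance.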
