Holons Visual Representation for Image Retrieval

As image collections grow in scale, conventional local features such as SIFT become ineffective for representation and indexing, and more compact visual representations are required. The state-of-the-art vector of locally aggregated descriptors (VLAD) has several limitations that stem from its intrinsic mechanism. To address them, we propose a new descriptor named holons visual representation (HVR). The proposed HVR is a derivative, mutational, self-contained combination of global and local information. It exploits both global characteristics and the statistical information of local descriptors in the image dataset. It also takes advantage of the local features of each image and computes their distribution with respect to the entire local descriptor space. Accordingly, the HVR is computed by a two-layer hierarchical scheme, which splits the local feature space into raw partitions and then into corresponding refined partitions. Then, according to the distances from the partition centroids to the local features and their spatial correlation, we assign each local feature to its nearest raw partition and refined partition to obtain the global description of an image. Compared with VLAD, HVR preserves critical structure information and enhances the discriminative power of each individual representation at a small additional computational cost and with the same memory overhead. Extensive experiments on several benchmark datasets demonstrate that the proposed HVR outperforms conventional approaches in both scalability and retrieval accuracy for images with similar intra local information.
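The two-layer assignment described above can be illustrated with a minimal sketch. The abstract does not give the exact HVR formulation, so everything here is an assumption made for illustration: the function name `two_layer_aggregate`, the choice to accumulate residuals against the nearest refined centroid, and the final L2 normalization (borrowed from common VLAD practice). Spatial correlation is omitted. The output keeps one slot per raw partition, so its length matches a plain VLAD vector with the same number of raw centroids.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def two_layer_aggregate(descriptors, raw_centroids, refined_centroids):
    """Hypothetical two-layer aggregation (not the paper's exact method).

    raw_centroids:     K centroids of the raw partitions.
    refined_centroids: refined_centroids[k] lists the centroids of the
                       refined partitions inside raw partition k.
    Returns an L2-normalized vector of length K * dim.
    """
    dim = len(raw_centroids[0])
    slots = [[0.0] * dim for _ in raw_centroids]
    for x in descriptors:
        k = nearest(x, raw_centroids)          # layer 1: raw partition
        m = nearest(x, refined_centroids[k])   # layer 2: refined partition
        c = refined_centroids[k][m]
        for j in range(dim):
            # Assumption: the residual is taken against the refined centroid
            # and accumulated in the slot of the enclosing raw partition.
            slots[k][j] += x[j] - c[j]
    flat = [v for slot in slots for v in slot]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


if __name__ == "__main__":
    raw = [[0.0, 0.0], [10.0, 10.0]]
    refined = [[[-1.0, 0.0], [1.0, 0.0]], [[10.0, 9.0], [10.0, 11.0]]]
    print(two_layer_aggregate([[2.0, 0.0], [10.0, 12.0]], raw, refined))
```

In practice the raw and refined centroids would be learned offline, for example by running k-means on the whole descriptor set and then again inside each raw partition; that training step is also an assumption and is not shown here.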
