Discriminative Multi-View Privileged Information Learning for Image Re-Ranking

Conventional multi-view re-ranking methods usually perform asymmetric matching between the region of interest (ROI) in the query image and the whole target image when computing similarity. Because the two differ in visual appearance, this practice tends to degrade retrieval accuracy, particularly when the ROI, which is usually interpreted as the image objectness, occupies only a small region of the image. Privileged Information (PI), which can be viewed as an image prior, characterizes image objectness well, so in this paper we leverage PI to further improve the performance of multi-view re-ranking. To this end, we propose a discriminative multi-view re-ranking approach in which the original global visual content of the image and the local auxiliary PI features are integrated into a unified training framework that generates latent subspaces with sufficient discriminating power. For on-the-fly re-ranking, the multi-view PI features are unavailable, so we project only the original multi-view image representations onto the latent subspace; re-ranking is then achieved by computing and sorting the distances from the multi-view embeddings to the separating hyperplane. Extensive experimental evaluations on two public benchmarks, Oxford5k and Paris6k, show that our approach yields a further performance boost for image re-ranking, and a comparative study demonstrates its advantage over other multi-view re-ranking methods.
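
The test-time step described above (project the original views, measure distance to the hyperplane, sort) can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `rerank`, the per-view projection matrices, the hyperplane parameters `w` and `b`, and the choice to concatenate the per-view embeddings are all assumptions made for illustration, and the training stage that would use PI to learn these quantities is omitted.

```python
import numpy as np


def rerank(views, projections, w, b):
    """Rank candidates by signed distance to a separating hyperplane.

    views:       list of (n, d_v) arrays, one per view (original features, no PI)
    projections: list of (d_v, k_v) arrays; hypothetical learned projections
    w, b:        hypothetical hyperplane parameters in the joint latent space
    """
    # Project each view onto its latent subspace, then concatenate the
    # embeddings (concatenation is an assumption of this sketch).
    z = np.hstack([x @ p for x, p in zip(views, projections)])
    # Signed distance of every embedding to the hyperplane w.z + b = 0.
    scores = (z @ w + b) / np.linalg.norm(w)
    # A larger signed distance is treated as more relevant: sort descending.
    order = np.argsort(-scores, kind="stable")
    return order, scores
```

Calling `rerank` with one feature array per view returns the candidate indices from most to least relevant together with their scores. No PI features appear anywhere in the call, which mirrors the constraint that PI is available only during training.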
