Improving the similarity estimation via score distribution

Distance-based similarity estimation between two images is not always reliable, owing to limitations in both image understanding techniques and distance measures. This paper presents a novel approach that improves similarity estimation by incorporating the distribution of similarity scores. The key idea rests on the assumption that truly relevant images produce similar distributions of similarity scores when each is used to query an independent database. Each distribution is represented by the area under its similarity score curve, so the difference between two distributions can be computed easily and used to update the original distance measure. Experiments on three public datasets with various feature representations show that the enhanced similarity estimation remarkably outperforms the original distance measure, and that the approach generalizes well across datasets and feature representations.
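The idea described above can be illustrated with a minimal sketch. The abstract does not give the similarity function, the way the area is computed, or the update rule, so everything here is an assumption made for illustration: cosine similarity against the independent database, a normalized trapezoidal area over the descending-sorted scores, a Euclidean base distance, and a multiplicative update controlled by a hypothetical weight `alpha`. All function names are hypothetical as well.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def euclidean(a, b):
    """Base distance measure (assumed; the abstract does not name one)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def score_curve_area(feature, reference_db):
    """Area under the similarity score curve of `feature` queried
    against an independent reference database.

    Scores are sorted in descending order and integrated with the
    trapezoidal rule, then divided by the number of intervals so the
    result does not depend on the database size (an assumed choice).
    """
    scores = sorted(
        (cosine_similarity(feature, ref) for ref in reference_db),
        reverse=True,
    )
    if len(scores) < 2:
        return scores[0] if scores else 0.0
    area = sum((scores[i] + scores[i + 1]) / 2.0 for i in range(len(scores) - 1))
    return area / (len(scores) - 1)


def enhanced_distance(query, candidate, reference_db, alpha=1.0):
    """Base distance scaled up by the gap between the two score
    distributions. The multiplicative form is an assumption, not the
    paper's stated update rule."""
    base = euclidean(query, candidate)
    gap = abs(
        score_curve_area(query, reference_db)
        - score_curve_area(candidate, reference_db)
    )
    return base * (1.0 + alpha * gap)


if __name__ == "__main__":
    reference_db = [[1.0, 0.0], [0.0, 1.0]]  # toy independent database
    query = [1.0, 0.0]
    candidate = [1.0, 1.0]
    print(euclidean(query, candidate))
    print(enhanced_distance(query, candidate, reference_db))
```

Under these assumptions, two images whose score curves enclose the same area keep their original distance, while a pair with differing areas is pushed further apart, which mirrors the assumption that truly relevant images behave alike when queried against an independent database.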
