Sparse Online Learning of Image Similarity

Learning image similarity plays a critical role in real-world multimedia information retrieval applications, especially in Content-Based Image Retrieval (CBIR) tasks, where accurate retrieval of visually similar objects relies largely on an effective image similarity function. Crafting a good similarity function is challenging because the visual content of images is often represented as feature vectors in high-dimensional spaces, for example via bag-of-words (BoW) representations, and traditional rigid similarity functions such as cosine similarity are often suboptimal for CBIR tasks. In this article, we address this fundamental problem, namely learning to optimize image similarity over sparse, high-dimensional representations from large-scale training data, and propose a novel scheme, Sparse Online Learning of Image Similarity (SOLIS). In contrast to many existing image-similarity learning algorithms, which are designed for low-dimensional data, SOLIS can learn image similarity from large-scale image data in sparse, high-dimensional spaces. Our encouraging results show that the proposed technique achieves accuracy highly competitive with state-of-the-art approaches while offering significant advantages in computational efficiency, model sparsity, and retrieval scalability, making it more practical for real-world multimedia retrieval applications.
