Featured correspondence topic model for semantic search on social image collections

A new framework retrieves semantically relevant images from a social image database. A probabilistic topic model predicts missing tags and removes noisy ones. Two algorithms estimate the model parameters and the tag correspondence. The scoring scheme relies on the fusion of visual and textual information. The approach outperforms state-of-the-art methods in image annotation and retrieval.

Due to the rapid growth of digital technologies, huge volumes of image data are created and shared on social media sites. User-provided tags attached to each social image are widely recognized as a bridge across the semantic gap between low-level image features and high-level concepts. Hence, combining images with their corresponding tags is useful for intelligent retrieval systems, which are designed to gain a high-level understanding of images and to facilitate semantic search. In practice, however, user-provided tags are usually incomplete and noisy, which may degrade retrieval performance. To tackle this problem, we present a novel retrieval framework that automatically associates visual content with textual tags and enables effective image search. To this end, we first propose a probabilistic topic model, learned on social images, that discovers latent topics from the co-occurrence of tags and image features. The topic model also exploits expert knowledge about the correlation between tags and visual content, and about the relationship among image features, which is formulated in terms of spatial location and color distribution. The discovered topics then help to predict the missing tags of an unseen image as well as those of partially labeled images in the database. These predicted tags make the measure of semantic similarity between the query and the database images more reliable. We therefore further present a scoring scheme that estimates this similarity by fusing textual tags and visual representation.
Extensive experiments conducted on three benchmark datasets show that our topic model produces accurate annotations despite noisy and incomplete tags. Using our generalized scoring scheme, which handles many types of queries, the proposed approach also outperforms state-of-the-art approaches in terms of retrieval accuracy.
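
The two ideas summarized above, predicting tags from latent topics and scoring by fusing textual and visual evidence, can be illustrated with a minimal sketch. The abstract does not give the model's actual equations, so everything here is an assumption made for illustration: the function names, the mixture formula p(tag | image) = sum over k of p(topic k | image) * p(tag | topic k), the cosine visual similarity, and the linear fusion weight `alpha` are hypothetical and are not the authors' method.

```python
import math


def predict_tag_scores(topic_mix, tag_given_topic):
    """Hypothetical tag predictor: mix per-topic tag distributions.

    topic_mix[k]        -- assumed p(topic k | image)
    tag_given_topic[k]  -- assumed dict mapping tag -> p(tag | topic k)
    """
    scores = {}
    for k, weight in enumerate(topic_mix):
        for tag, prob in tag_given_topic[k].items():
            scores[tag] = scores.get(tag, 0.0) + weight * prob
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fused_score(query_tags, query_feat, image_tag_scores, image_feat, alpha=0.5):
    """Assumed linear fusion of a textual and a visual similarity.

    A query may carry tags, a visual feature vector, or both; a missing
    part simply contributes zero to the fused score.
    """
    text_sim = 0.0
    if query_tags:
        text_sim = sum(image_tag_scores.get(t, 0.0) for t in query_tags) / len(query_tags)
    visual_sim = cosine(query_feat, image_feat) if query_feat else 0.0
    return alpha * text_sim + (1.0 - alpha) * visual_sim


# Toy example with two made-up topics.
topics = [{"beach": 0.8, "sea": 0.2}, {"sea": 0.6, "city": 0.4}]
tag_scores = predict_tag_scores([0.5, 0.5], topics)
score = fused_score(["beach"], [1.0, 0.0], tag_scores, [1.0, 0.0], alpha=0.5)
```

A linear fusion is used here only because it makes the role of each modality easy to see: tag-only and image-only queries fall out of the same function when one part of the query is empty.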
