A two-stage hybrid probabilistic topic model for refining image annotation

Refining image annotation has become one of the core research topics in computer vision and pattern recognition because of its great potential in image retrieval. However, the field is still in its infancy, and current methods are not sophisticated enough to extract accurate semantic concepts from low-level image features alone. In this paper, we propose a two-stage hybrid probabilistic topic model to improve the quality of automatic image annotation. First, a probabilistic latent semantic analysis model with asymmetric modalities is learned to estimate the posterior probability of each annotation keyword, which establishes the image-to-word relation. Next, a label similarity graph is constructed from a weighted linear combination of label similarity and the visual similarity of the images associated with the corresponding labels. In this way, information from low-level visual features and high-level semantic concepts is integrated by taking into account both the word-to-word and image-to-image relations. Finally, a rank-two relaxation heuristic is applied to further mine the correlations among the candidate annotations and obtain the refined annotations, which play a critical role in semantic-based image retrieval. Extensive experiments show that the proposed model achieves not only superior annotation accuracy but also better retrieval performance.
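The graph construction step can be illustrated with a minimal sketch, assuming the two similarity measures are already available as square matrices over the same candidate labels. The abstract does not specify how either similarity is computed or how the mixing weight is chosen, so the function name `combine_similarity`, the parameter `alpha`, and the toy values below are all hypothetical.

```python
import numpy as np

def combine_similarity(label_sim, visual_sim, alpha=0.5):
    """Return edge weights W = alpha * label_sim + (1 - alpha) * visual_sim.

    alpha is an assumed mixing weight in [0, 1]; the abstract does not
    state the value or how it is selected.
    """
    label_sim = np.asarray(label_sim, dtype=float)
    visual_sim = np.asarray(visual_sim, dtype=float)
    if label_sim.ndim != 2 or label_sim.shape[0] != label_sim.shape[1]:
        raise ValueError("label_sim must be a square matrix")
    if label_sim.shape != visual_sim.shape:
        raise ValueError("both similarity matrices must have the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    weights = alpha * label_sim + (1.0 - alpha) * visual_sim
    np.fill_diagonal(weights, 0.0)  # drop self-loops in the label graph
    return weights

# Toy example with three candidate labels (values are made up).
label_sim = np.array([[1.0, 0.8, 0.1],
                      [0.8, 1.0, 0.2],
                      [0.1, 0.2, 1.0]])
visual_sim = np.array([[1.0, 0.6, 0.3],
                       [0.6, 1.0, 0.4],
                       [0.3, 0.4, 1.0]])
graph = combine_similarity(label_sim, visual_sim, alpha=0.5)
```

With `alpha` close to 1 the graph is driven mostly by word-to-word similarity, and with `alpha` close to 0 mostly by image-to-image similarity. The resulting symmetric weight matrix is the kind of input a graph-partition step such as a rank-two relaxation heuristic would operate on.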
