Paper Information - Multilabel Image Annotation Based on Double-Layer PLSA Model

Multilabel Image Annotation Based on Double-Layer PLSA Model

Because of the semantic gap between low-level visual features and high-level semantic concepts, automatic image annotation remains a difficult problem in computer vision. In this paper, we propose a multilabel image annotation method based on double-layer probabilistic latent semantic analysis (PLSA). The double-layer PLSA model is constructed to bridge the low-level visual features and the high-level semantic concepts of images for effective image understanding. Low-level image features are represented as visual words using the Bag-of-Words model, and the first PLSA layer extracts latent semantic topics separately from two aspects, visual and texture. The second PLSA layer then fuses the visual and texture latent semantic topics into a top-layer latent semantic topic. The double-layer PLSA thereby establishes the relationship between the visual features and the semantic concepts of images, so that the labels of new images can be predicted from their low-level features. Experimental results show that the proposed model achieves promising labeling performance and outperforms previous methods on the standard Corel dataset.
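The two-layer pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names (`plsa`, `double_layer_plsa`), the toy count matrices, and the fusion rule are all assumptions: the abstract does not say how the second layer consumes the first-layer topics, so this sketch concatenates the per-image topic mixtures from both feature channels and scales them into pseudo-counts. Label prediction is left out because the abstract does not specify it.

```python
import random

def normalize(row):
    """Scale a list of non-negative numbers to sum to 1 (uniform if all zero)."""
    total = sum(row)
    if total <= 0:
        return [1.0 / len(row)] * len(row)
    return [v / total for v in row]

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on an (images x words) count matrix given as nested lists.

    Returns P(z|d) as (images x topics) and P(w|z) as (topics x words).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    for _ in range(n_iter):
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: posterior P(z|d,w) is proportional to P(z|d) * P(w|z).
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # M-step accumulation: expected counts n(d,w) * P(z|d,w).
                for z in range(n_topics):
                    mass = counts[d][w] * post[z]
                    new_w_z[z][w] += mass
                    new_z_d[d][z] += mass
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_z_d, p_w_z

def double_layer_plsa(visual_counts, texture_counts, n_topics1, n_topics2,
                      scale=100.0):
    """First layer: one PLSA per feature channel. Second layer: PLSA on the fusion."""
    visual_topics, _ = plsa(visual_counts, n_topics1, seed=1)
    texture_topics, _ = plsa(texture_counts, n_topics1, seed=2)
    # Assumed fusion rule (not from the paper): concatenate both topic
    # mixtures per image and scale them into pseudo-counts.
    fused = [[scale * p for p in v + t]
             for v, t in zip(visual_topics, texture_topics)]
    top_topics, _ = plsa(fused, n_topics2, seed=3)
    return top_topics

# Toy visual-word histograms for four images (made-up numbers).
visual = [[5, 4, 0, 0], [6, 3, 0, 1], [0, 0, 5, 4], [1, 0, 4, 6]]
texture = [[3, 0, 0], [4, 1, 0], [0, 2, 5], [0, 1, 6]]
top = double_layer_plsa(visual, texture, n_topics1=2, n_topics2=2)
```

Each row of `top` is a top-layer topic distribution for one image. In an annotation setting, these distributions would be related to label words, but that step depends on details the abstract does not give.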

Zhihua Chen | Jing Zhang | Da Li | Weiwei Hu | Yubo Yuan
