A Multimodal Multimedia Retrieval Model Based on pLSA

In this paper, we propose a multimodal multimedia retrieval model based on probabilistic latent semantic analysis (pLSA). First, we employ pLSA to model the generative processes of the texts and the images that occur in the same documents, one model per modality. We then use multivariate linear regression to analyze the correlation between the textual and visual representations, and estimate the regression matrix with the ordinary least squares (OLS) method; this matrix can be used to transform data between the textual and visual modalities. Extensive experimental results demonstrate the effectiveness and efficiency of the proposed model.
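The regression step can be illustrated with a minimal sketch. It assumes that each document is already represented by a textual topic mixture and a visual topic mixture (in the model these would come from the two pLSA fits; here they are synthetic stand-ins). The function names `fit_regression_matrix` and `rank_images`, the matrix shapes, and the use of cosine similarity for ranking are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def fit_regression_matrix(T, V):
    """OLS estimate of W minimizing ||T @ W - V||_F^2.

    T: (n_docs, k_text) textual topic mixtures.
    V: (n_docs, k_img) visual topic mixtures.
    """
    W, *_ = np.linalg.lstsq(T, V, rcond=None)
    return W


def rank_images(text_topics, W, image_topics):
    """Map a textual representation into the visual space and rank
    images by cosine similarity (the ranking rule is an assumption)."""
    predicted = text_topics @ W
    predicted = predicted / np.linalg.norm(predicted)
    normed = image_topics / np.linalg.norm(image_topics, axis=1, keepdims=True)
    return np.argsort(-(normed @ predicted))


# Synthetic stand-ins for the pLSA topic mixtures (hypothetical data).
rng = np.random.default_rng(0)
n_docs, k_text, k_img = 200, 8, 6
T = rng.dirichlet(np.ones(k_text), size=n_docs)
W_true = rng.random((k_text, k_img))
V = T @ W_true  # noise-free linear relation, for illustration only

W = fit_regression_matrix(T, V)
ranking = rank_images(T[0], W, V)
```

With noise-free synthetic data the estimate recovers the generating matrix, and the image paired with the query text is ranked first. On real data the fit would only be approximate, and a regularized estimator could be substituted if the topic mixtures are nearly collinear.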
