Automatic web image selection with a probabilistic latent topic model

We propose a new method for selecting images relevant to given keywords from images gathered from the Web. The method is based on the Probabilistic Latent Semantic Analysis (PLSA) model, a probabilistic latent topic model originally proposed for text document analysis. The experimental results show that the proposed method performs on par with or better than existing methods. In addition, the results show that our method selects more diverse images than existing SVM-based methods.
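As an illustration of the underlying model, the following is a minimal sketch of PLSA fitted by EM on a matrix of visual-word counts, one row per image. It is not the implementation used in this work: the bag-of-visual-words input, the function names `plsa` and `rank_by_topic`, the number of topics, and the rule of ranking images by P(z|d) for a topic assumed to be relevant are all assumptions made for the example.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM.

    counts[d][w] is the count of (visual) word w in image d.
    Returns (p_w_z, p_z_d): P(w|z) per topic and P(z|d) per image.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        return [v / total for v in row]

    # Random positive initialisation of P(w|z) and P(z|d).
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iter):
        # Accumulators start at a tiny value so no probability is exactly 0.
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z|d,w) is proportional to P(w|z) P(z|d).
                post = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                total = sum(post)
                # M-step accumulation: expected counts n(d,w) P(z|d,w).
                for z in range(n_topics):
                    r = n * post[z] / total
                    new_w_z[z][w] += r
                    new_z_d[d][z] += r
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_w_z, p_z_d


def rank_by_topic(p_z_d, topic):
    """Return image indices sorted by descending P(topic|d).

    Assumption for this sketch: `topic` has already been judged relevant
    to the keyword by some other means.
    """
    return sorted(range(len(p_z_d)), key=lambda d: -p_z_d[d][topic])


if __name__ == "__main__":
    # Toy counts: 4 images, 6 visual words (made-up numbers).
    toy = [[5, 4, 6, 0, 0, 0],
           [6, 5, 4, 0, 0, 0],
           [0, 0, 0, 5, 6, 4],
           [0, 0, 0, 4, 5, 6]]
    _, p_z_d = plsa(toy, n_topics=2)
    print(rank_by_topic(p_z_d, topic=0))
```

Ranking on the mixing proportions P(z|d) rather than on a discriminative score is only one possible selection rule; how the relevant topics are chosen is left open here.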
