Leveraging auxiliary text terms for automatic image annotation
This paper proposes a novel algorithm that annotates web images by automatically aligning each image with its most relevant auxiliary text terms. First, DOM-based web page segmentation is performed to extract the images and their most relevant auxiliary text blocks. Second, automatic image clustering partitions the web images into groups according to their visual similarity contexts, which significantly reduces the uncertainty about the relatedness between the images and their auxiliary terms. The semantics of the visually similar images in the same cluster are then described by a shared ranked list of the terms that frequently co-occur in their text blocks. Finally, a relevance re-ranking process is performed over a term correlation network to further refine the ranked term list. Experiments on a large-scale database of web pages yielded very positive results.
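The last two steps of the pipeline (per-cluster term ranking and re-ranking over a term correlation network) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring or re-ranking formulas, so the sketch assumes that cluster labels are already available, scores each term by the fraction of images in the cluster whose text block contains it, and substitutes a random walk with restart over a co-occurrence graph for the re-ranking step. All function names and parameters (`rank_terms_per_cluster`, `cooccurrence_network`, `rerank`, `restart`, `iterations`) are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def rank_terms_per_cluster(cluster_of, terms_of):
    """Score each term by the fraction of images in a cluster that carry it.

    cluster_of: image id -> cluster label (assumed given by a prior clustering step)
    terms_of:   image id -> set of terms from the image's auxiliary text block
    """
    counts = defaultdict(Counter)
    sizes = Counter(cluster_of.values())
    for image, label in cluster_of.items():
        counts[label].update(set(terms_of[image]))
    return {
        label: {term: n / sizes[label] for term, n in counter.items()}
        for label, counter in counts.items()
    }


def cooccurrence_network(terms_of):
    """Build symmetric edge weights: number of text blocks shared by two terms."""
    weights = defaultdict(float)
    for terms in terms_of.values():
        for a, b in combinations(sorted(set(terms)), 2):
            weights[(a, b)] += 1.0
            weights[(b, a)] += 1.0
    return weights


def rerank(scores, weights, restart=0.5, iterations=20):
    """Assumed stand-in for the re-ranking step: random walk with restart.

    The initial scores act as the restart distribution; the walk moves
    between terms in proportion to their co-occurrence weights.
    """
    total = sum(scores.values())
    prior = {t: s / total for t, s in scores.items()}
    out_weight = {
        t: sum(weights.get((t, u), 0.0) for u in prior) for t in prior
    }
    current = dict(prior)
    for _ in range(iterations):
        updated = {}
        for t in prior:
            flow = sum(
                current[u] * weights.get((u, t), 0.0) / out_weight[u]
                for u in prior
                if out_weight[u] > 0
            )
            updated[t] = restart * prior[t] + (1 - restart) * flow
        current = updated
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(current, key=lambda t: (-current[t], t))


if __name__ == "__main__":
    # Toy data: two visually similar images in cluster 0, one image in cluster 1.
    cluster_of = {"img_a": 0, "img_b": 0, "img_c": 1}
    terms_of = {
        "img_a": {"tiger", "grass"},
        "img_b": {"tiger", "zoo"},
        "img_c": {"car"},
    }
    ranked = rank_terms_per_cluster(cluster_of, terms_of)
    network = cooccurrence_network(terms_of)
    for label, scores in ranked.items():
        print(label, rerank(scores, network))
```

In this toy input, "tiger" appears in both text blocks of cluster 0, so it outranks the terms that appear in only one, which mirrors the intuition that terms shared across visually similar images are more likely to describe them.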