Clustering Presentation of Web Image Retrieval Results using Textual Information and Image Features

The increasing prevalence of broadband Internet access makes it easier to obtain rich content such as images, and more people are attempting image retrieval. We focus on how to present web image retrieval results to users. Most retrieval results contain multiple topics. To offset this complexity, many papers have discussed clustering of text retrieval results [11][14], in which documents are grouped by topic using a distance based on text similarity.

To group web image retrieval results, we have to consider the differences between image retrieval and text retrieval. First, most images on the Web do not have any textual information, so textual information must be extracted automatically if web images are to be indexed by semantic information. Second, text retrieval shows users text snippets, which may not contain the information the user wants; thumbnail images, by contrast, are reduced-size versions of the originals, so the user can tell at a glance whether the original image is the desired one. We therefore consider the presentation of retrieval results an important task in web image retrieval.

In this paper, we describe how to semantically classify image retrieval results to make web image retrieval more effective. Text classification based on machine learning is used to generate basic semantic information, and image features and textual features are used for cluster presentation. We propose methods for presenting image retrieval results through the application of clustering. Experiments show that our procedure is effective.
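As an illustration of result clustering by text-similarity distance, the following is a minimal sketch, not the procedure proposed in this paper. It assumes bag-of-words term counts, cosine distance, and average-link agglomerative merging; the function names, the sample snippets, and the `threshold` parameter are hypothetical choices made for the example.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def cluster_by_text(snippets, threshold=0.5):
    """Average-link agglomerative clustering of text snippets.

    Repeatedly merges the two closest clusters and stops once the
    closest pair is farther apart than `threshold` (an assumed cutoff).
    Returns clusters as lists of snippet indices.
    """
    vectors = [Counter(s.lower().split()) for s in snippets]
    clusters = [[i] for i in range(len(snippets))]
    while len(clusters) > 1:
        best = None  # (distance, index of cluster x, index of cluster y)
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[x] for j in clusters[y]]
                d = sum(
                    cosine_distance(vectors[i], vectors[j]) for i, j in pairs
                ) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > threshold:
            break
        _, x, y = best
        clusters[x].extend(clusters[y])
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical text associated with four images returned for one query.
snippets = [
    "jaguar car engine",
    "jaguar car dealer",
    "jaguar cat jungle",
    "jaguar cat wildlife",
]
print(cluster_by_text(snippets))  # [[0, 1], [2, 3]]
```

In this toy query the two topics share only the query term, so the car images and the cat images fall into separate clusters. A real system would also need the automatically extracted text and the image features described above, which this sketch leaves out.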

[1] Shih-Fu Chang, et al. Visually Searching the Web for Content, 1997, IEEE Multim.

[2] Oren Etzioni, et al. Grouper: A Dynamic Clustering Interface to Web Search Results, 1999, Comput. Networks.

[3] Tom M. Mitchell, et al. Learning to construct knowledge bases from the World Wide Web, 2000, Artif. Intell.

[4] Keiji Yanai, et al. Image collector: an image-gathering system from the world-wide web employing keyword-based search engines, 2001, IEEE International Conference on Multimedia and Expo (ICME 2001).

[5] W. Bruce Croft, et al. An Evaluation of Techniques for Clustering Search Results, 2005.

[6] Ellen M. Voorhees, et al. Implementing agglomerative hierarchic clustering algorithms for use in document retrieval, 1986, Inf. Process. Manag.

[7] Aya Soffer, et al. PicASHOW: pictorial authority search by hyperlinks on the web, 2002, ACM Trans. Inf. Syst.

[8] Chinatsu Aone, et al. Fast and effective text mining using linear-time document clustering, 1999, KDD '99.

[9] Shinichiro Takagi, et al. Japanese Morphological Analyzer using Word Co-occurence -JTAG, 1998, COLING-ACL.

[10] James Ze Wang, et al. SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture LIbraries, 2000, IEEE Trans. Pattern Anal. Mach. Intell.

[11] Wei-Ying Ma, et al. Hierarchical clustering of WWW image search results using visual, textual and link information, 2004, MULTIMEDIA '04.

[12] S. Sclaroff, et al. Combining textual and visual cues for content-based image retrieval on the World Wide Web, 1998, Proceedings. IEEE Workshop on Content-Based Access of Image and Video Libraries.

[13] Masahiko Yachida, et al. Image clustering system on WWW using Web texts, 2004, Fourth International Conference on Hybrid Intelligent Systems (HIS'04).