Searching the Web Using Screenshots

Many online articles contain useful know-how about GUI applications. Although these articles tend to be richly illustrated with screenshots, no system has been designed to exploit those screenshots for visually searching know-how articles. In this paper, we present a novel system for indexing and searching software know-how articles that leverages the visual correspondences between screenshots. To retrieve articles about an application, a user takes a screenshot of the application and submits it as a query; the system returns a list of articles containing a matching screenshot. Useful snippets such as captions, references, and nearby text are automatically extracted from the retrieved articles and shown alongside thumbnails of the matching screenshots as excerpts for relevance judgment. Retrieved articles are ranked using a comprehensive set of visual, textual, and site features, whose weights are learned by RankSVM. Our prototype system currently contains 150k articles, classified into walkthrough, book, gallery, and general categories. We demonstrate the system’s ability to retrieve matching screenshots for a wide variety of programs and across language boundaries, and to provide subjectively more useful results than keyword-based web and image search engines.
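
The ranking step can be illustrated with a minimal sketch of the pairwise idea behind RankSVM. This is not the system's implementation: the three feature names (visual_match, text_match, site_quality), the toy relevance labels, and the helper functions are all hypothetical, and plain subgradient descent on a pairwise hinge loss stands in for a real SVM solver.

```python
import random


def pairwise_differences(queries):
    """Build difference vectors x_better - x_worse within each query.

    A good weight vector w should give w . d > 0 for every returned d.
    """
    diffs = []
    for results in queries:
        for xi, yi in results:
            for xj, yj in results:
                if yi > yj:
                    diffs.append([a - b for a, b in zip(xi, xj)])
    return diffs


def train_rank_weights(diffs, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Subgradient descent on an L2-regularised pairwise hinge loss."""
    rng = random.Random(seed)
    order = list(diffs)  # copy so the caller's list is not reordered
    w = [0.0] * len(order[0])
    for _ in range(epochs):
        rng.shuffle(order)
        for d in order:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            for k in range(len(w)):
                # The hinge term contributes only while the margin is below 1.
                grad = reg * w[k] - (d[k] if margin < 1.0 else 0.0)
                w[k] -= lr * grad
    return w


def score(w, x):
    """Linear ranking score of one article's feature vector."""
    return sum(wk * xk for wk, xk in zip(w, x))


# Hypothetical toy data: ([visual_match, text_match, site_quality], relevance).
queries = [
    [([0.9, 0.2, 0.5], 2), ([0.4, 0.3, 0.5], 1), ([0.1, 0.8, 0.4], 0)],
    [([0.8, 0.6, 0.3], 2), ([0.3, 0.5, 0.9], 0)],
]
weights = train_rank_weights(pairwise_differences(queries))
ranked = sorted(queries[0], key=lambda r: score(weights, r[0]), reverse=True)
```

Sorting by the learned linear score reproduces the preference order of the toy labels. In practice, the preference pairs would come from relevance judgments on retrieved articles rather than hand-written labels.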
