Bringing order to the Web: automatically categorizing search results

We developed a user interface that organizes Web search results into hierarchical categories. Text classification algorithms automatically classify arbitrary search results into an existing category structure on the fly. A user study compared the new category interface with the typical ranked-list interface for search results. The study showed that the category interface is superior on both objective and subjective measures. Subjects liked the category interface much better than the list interface, and they were 50% faster at finding information that was organized into categories. Organizing search results lets users focus on items in categories of interest rather than browse through all the results sequentially.
