MetaSpider: Meta-searching and categorization on the Web

It has become increasingly difficult to locate relevant information on the Web, even with the help of Web search engines. This article studies two approaches to the low precision and poor presentation of results in current search tools: meta-search and document categorization. Meta-search engines improve precision by selecting and integrating search results from generic or domain-specific Web search engines or other resources. Document categorization promises better organization and presentation of retrieved results. The article introduces MetaSpider, a meta-search engine with real-time indexing and categorizing functions, describes its major components, and discusses related technical approaches. A user evaluation study compared MetaSpider, NorthernLight, and MetaCrawler in terms of clustering performance and of time and effort expended. Initial results show that MetaSpider performed best in precision rate, with no statistically significant differences among the three systems in recall rate or time requirements. The study also indicates that MetaSpider exhibited a higher level of automation than the other two systems and facilitated efficient searching by giving the user an organized, comprehensive view of the retrieved documents.
