An Optimistic Model for Searching Web Directories

Web directories are taxonomies that classify Web documents using a directed acyclic graph of categories. This paper introduces an optimistic model for Web directories that improves the performance of restricted searches, that is, searches limited to one area of the category graph. The model treats the directed acyclic graph of categories as a tree with some “exceptions”. The validity of this optimistic model has been analysed by implementing it and comparing it with a basic model and a hybrid model with partial information. The proposed model improves the response time of the basic model by 50%. With respect to the hybrid model, both systems provide similar response times, except for queries with large answers, where the optimistic model outperforms the hybrid model by approximately 61%. Moreover, in a saturated workload environment the optimistic model performed better than the basic and hybrid models for all types of queries.
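The idea of a tree with “exceptions” can be illustrated with a minimal sketch. This is not the data structure used in the paper, which the abstract does not describe. It assumes each category has one primary parent (the tree edge) and possibly some secondary parents (the exception links), and it uses preorder interval labelling, a technique swapped in here for illustration, to test subtree membership. All names (`Directory`, `add_category`, `restricted_search`, the sample categories) are hypothetical.

```python
from collections import defaultdict


class Directory:
    """Category graph stored as a tree plus a set of exception links."""

    def __init__(self):
        self.children = defaultdict(list)    # tree edges: primary parent -> children
        self.exceptions = defaultdict(list)  # extra DAG edges: secondary parent -> children
        self.interval = {}                   # category -> (first, last) preorder numbers

    def add_category(self, name, parent=None, extra_parents=()):
        if parent is not None:
            self.children[parent].append(name)
        for p in extra_parents:
            self.exceptions[p].append(name)

    def number(self, root):
        """Label each category with a preorder interval, using tree edges only."""
        counter = 0

        def visit(cat):
            nonlocal counter
            first = counter
            counter += 1
            for child in self.children.get(cat, ()):
                visit(child)
            self.interval[cat] = (first, counter - 1)

        visit(root)

    def in_tree_subtree(self, cat, root):
        """Constant-time check: is cat inside root's subtree (tree edges only)?"""
        first, last = self.interval[root]
        return first <= self.interval[cat][0] <= last

    def restricted_search(self, root, doc_category):
        """Return ids of documents filed under root, following exception links."""
        # Optimistic case: one interval test against root covers the whole tree.
        # Exceptions add further subtree roots that must also be checked.
        roots, seen, i = [root], {root}, 0
        while i < len(roots):
            current = roots[i]
            i += 1
            for source, targets in self.exceptions.items():
                if self.in_tree_subtree(source, current):
                    for target in targets:
                        if target not in seen:
                            seen.add(target)
                            roots.append(target)
        return {
            doc
            for doc, cat in doc_category.items()
            if any(self.in_tree_subtree(cat, r) for r in roots)
        }


# Hypothetical directory: "Acoustics" is filed under "Physics" (tree edge)
# and is also linked from "Music" (exception).
d = Directory()
d.add_category("Top")
d.add_category("Science", parent="Top")
d.add_category("Physics", parent="Science")
d.add_category("Acoustics", parent="Physics", extra_parents=["Music"])
d.add_category("Arts", parent="Top")
d.add_category("Music", parent="Arts")
d.number("Top")

# Documents that already matched the query terms, mapped to their category.
matches = {"d1": "Physics", "d2": "Acoustics", "d3": "Music", "d4": "Arts"}
arts_results = d.restricted_search("Arts", matches)
science_results = d.restricted_search("Science", matches)
```

In this sketch, a search restricted to a category with no exception links inside its subtree needs only a single interval comparison per matching document. The extra work grows with the number of exceptions, which is the sense in which such a model is optimistic.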
