Efficient Queries over Web Views

Large web sites are becoming repositories of structured information that can benefit from being viewed and queried as relational databases. However, querying these views efficiently requires new techniques. Data usually resides at a remote site and is organized as a set of related HTML documents, with network access being a primary cost factor in query evaluation. This cost can be reduced by exploiting the redundancy often found in site design. We use a simple data model, a subset of the Araneus data model, to describe the structure of a web site. We augment the model with link and inclusion constraints that capture the redundancies in the site. We map relational views of a site to a navigational algebra and show how to use the constraints to rewrite algebraic expressions, reducing the number of network accesses.
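The effect of constraint-based rewriting on network cost can be illustrated with a minimal sketch. This is not the paper's algebra or algorithm. It is a toy simulation in which every URL, page layout, and function name is hypothetical. It assumes a site where professor pages can be reached either through department pages or through a single list page (the kind of redundancy an inclusion constraint would describe), and where the anchor text of each link equals the name stored on the target page (the kind of redundancy a link constraint would describe). Under those assumptions, three navigation plans return the same answer with different numbers of page accesses.

```python
# Toy in-memory site; every URL, page layout, and name here is hypothetical.
# Each link is a (target_url, anchor_text) pair.
SITE = {
    "/depts": {"links": [("/dept/cs", "CS"), ("/dept/math", "Math")]},
    "/dept/cs": {"links": [("/prof/ada", "Ada"), ("/prof/alan", "Alan")]},
    "/dept/math": {"links": [("/prof/emmy", "Emmy")]},
    "/profs": {"links": [("/prof/ada", "Ada"),
                         ("/prof/alan", "Alan"),
                         ("/prof/emmy", "Emmy")]},
    "/prof/ada": {"name": "Ada"},
    "/prof/alan": {"name": "Alan"},
    "/prof/emmy": {"name": "Emmy"},
}


class CountingFetcher:
    """Stands in for the network: returns pages and counts accesses."""

    def __init__(self, site):
        self.site = site
        self.accesses = 0

    def get(self, url):
        self.accesses += 1
        return self.site[url]


def names_via_departments(fetcher):
    # Plan A: department list -> each department page -> each professor page.
    names = []
    for dept_url, _ in fetcher.get("/depts")["links"]:
        for prof_url, _ in fetcher.get(dept_url)["links"]:
            names.append(fetcher.get(prof_url)["name"])
    return sorted(names)


def names_via_prof_list(fetcher):
    # Plan B: valid only if the list page links to every professor page
    # that the department path reaches (assumed inclusion constraint).
    links = fetcher.get("/profs")["links"]
    return sorted(fetcher.get(url)["name"] for url, _ in links)


def names_from_anchors(fetcher):
    # Plan C: valid only if anchor text equals the target page's name
    # (assumed link constraint), so no link needs to be followed.
    return sorted(anchor for _, anchor in fetcher.get("/profs")["links"])


def run(plan):
    """Execute a plan on a fresh fetcher; return (answer, page accesses)."""
    fetcher = CountingFetcher(SITE)
    return plan(fetcher), fetcher.accesses


if __name__ == "__main__":
    for plan in (names_via_departments, names_via_prof_list, names_from_anchors):
        answer, cost = run(plan)
        print(plan.__name__, answer, cost)
```

On this toy site the three plans cost 6, 4, and 1 page accesses respectively. The rewrites are sound only if the assumed constraints actually hold on the site; if they do not, the cheaper plans may return a different answer.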
