Semantic caching of Web queries

Abstract. Performance is a major issue for meta-searchers that access distributed Web-based information repositories, and efficient query processing requires an appropriate caching mechanism. Unfortunately, the standard page-based and tuple-based caching mechanisms designed for conventional databases are not efficient on the Web, where keyword-based querying is often the only way to retrieve data. In this work, we study the problem of semantic caching of Web queries and develop a caching mechanism for conjunctive Web queries based on signature files. Our algorithms handle both semantic containment and semantic intersection between a query and the corresponding cache items. We also develop a cache replacement strategy for situations in which cached items differ in size and in their contribution to partial query answers. We report experimental results and show how the caching mechanism is realized in the Knowledge Broker system.
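
To illustrate the idea of signature-based containment checks for conjunctive keyword queries, the following is a minimal sketch and not the algorithm of the paper. It assumes superimposed coding: each keyword sets a few bits in a fixed-width signature, and a query signature is the bitwise OR of its keyword signatures. The names `term_signature`, `query_signature`, and `classify`, the constants `SIG_BITS` and `BITS_PER_TERM`, and the labels "full", "partial", and "miss" are all hypothetical. Because signatures can produce false drops, the sketch confirms each signature match against the actual keyword sets. It covers only containment in either direction, not general intersection.

```python
import hashlib

SIG_BITS = 64       # signature width in bits (assumed value)
BITS_PER_TERM = 3   # bits set per keyword (assumed value)


def term_signature(term: str) -> int:
    """Map one keyword to a bit pattern with up to BITS_PER_TERM bits set."""
    sig = 0
    for i in range(BITS_PER_TERM):
        digest = hashlib.sha256(f"{i}:{term.lower()}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:8], "big") % SIG_BITS)
    return sig


def query_signature(terms) -> int:
    """Superimpose (bitwise OR) the signatures of all keywords in a conjunction."""
    sig = 0
    for term in terms:
        sig |= term_signature(term)
    return sig


def classify(query_terms, cached_terms) -> str:
    """Compare a conjunctive query with one cached conjunctive query.

    "full":    the cached keywords are a subset of the query keywords, so the
               query answer is contained in the cached answer and can be
               obtained by filtering the cached items.
    "partial": the query keywords are a strict subset of the cached keywords,
               so the cached answer is only part of the query answer.
    "miss":    neither containment holds.
    """
    q = frozenset(t.lower() for t in query_terms)
    c = frozenset(t.lower() for t in cached_terms)
    q_sig, c_sig = query_signature(q), query_signature(c)

    # The signature test is a cheap filter; the set test removes false drops.
    if (c_sig & q_sig) == c_sig and c <= q:
        return "full"
    if (q_sig & c_sig) == q_sig and q <= c:
        return "partial"
    return "miss"


if __name__ == "__main__":
    print(classify(["cache", "web", "query"], ["cache", "web"]))
    print(classify(["cache"], ["cache", "web"]))
    print(classify(["cache"], ["signature"]))
```

In a real cache the signature comparison would run over all cached entries first, and the more expensive exact comparison only over the entries that pass it.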
