Cooperative XPath caching

Because XML is increasingly used in distributed applications, we propose a cooperative caching scheme for XML documents. Our scheme allows a number of peers to share cache content. To facilitate sharing, a distributed prefix-based index is built over the queries whose results are cached. In the loosely-coupled approach, each peer stores the results of its own queries in its local cache and publishes only the associated queries to the index. In the tightly-coupled approach, each peer is assigned a specific part of the query space and stores the results of the corresponding queries in its local cache. Both approaches result in a dynamic organization of content that evolves over time with the query load, the number of peers, and the overall storage available. We present a number of associated design choices, such as using a DHT to distribute the prefix-based index and a proactive cache replacement policy. We also report on experiments that show the benefits of cooperative caching and highlight the pros and cons of loosely-coupled and tightly-coupled cache sharing.
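To make the two sharing modes concrete, the following is a minimal single-process sketch, not the scheme evaluated in the paper. It assumes queries are simple child-axis paths such as /site/people, that the prefix-based index is a plain in-memory trie rather than a DHT, and that the tightly-coupled assignment hashes the first step of a query. The names PrefixIndex, publish, lookup, and responsible_peer are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


def steps(query: str) -> list[str]:
    """Split a simple child-axis path such as '/a/b/c' into its steps."""
    return [s for s in query.split("/") if s]


@dataclass
class Node:
    children: dict[str, Node] = field(default_factory=dict)
    peers: set[str] = field(default_factory=set)  # peers caching this exact query


class PrefixIndex:
    """Trie over query steps (assumption: stands in for the distributed index)."""

    def __init__(self) -> None:
        self.root = Node()

    def publish(self, query: str, peer: str) -> None:
        # Loosely-coupled mode: the peer keeps the result locally and
        # publishes only the query to the index.
        node = self.root
        for step in steps(query):
            node = node.children.setdefault(step, Node())
        node.peers.add(peer)

    def lookup(self, query: str) -> list[tuple[str, set[str]]]:
        """Return (cached_query, peers) for each cached prefix, longest first."""
        node = self.root
        path: list[str] = []
        hits: list[tuple[str, set[str]]] = []
        for step in steps(query):
            child = node.children.get(step)
            if child is None:
                break
            node = child
            path.append(step)
            if node.peers:
                hits.append(("/" + "/".join(path), set(node.peers)))
        return hits[::-1]


def responsible_peer(query: str, peers: list[str], depth: int = 1) -> str:
    # Tightly-coupled mode (assumption): the first `depth` steps of the query
    # decide which peer owns that part of the query space.
    key = "/".join(steps(query)[:depth])
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return peers[int.from_bytes(digest[:8], "big") % len(peers)]
```

In this sketch, a peer that needs /site/people/person/name would call lookup and contact a peer holding the longest cached prefix, on the assumption that the cached result of a prefix query contains enough of the document to answer the longer query locally. Queries that share their first step map to the same peer under responsible_peer, which is one simple way to partition the query space.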
