Profile-driven cache management

Modern distributed information systems cope with disconnection and limited bandwidth by using caches. In communication-constrained situations, traditional demand-driven approaches are inadequate. Instead, caches must be preloaded to compensate for absent connectivity or scarce bandwidth. We propose using application-level knowledge, expressed as profiles, to manage cache contents. We present a simple but rich profile language that lets users express their data needs at a high level and thereby specify the desired contents of a cache. We consider techniques for prefetching a cache on the basis of profiles expressed in our framework, for both basic and preemptive prefetching; the latter refers to the case where staging a cache can be interrupted at any point without prior warning. We examine the effectiveness of three profile-processing techniques and show that the expressiveness of our profile language does not prevent a fairly simple greedy algorithm from being effective. We also show that, for a large shared cache, multiple clients' profiles can be combined into a single superprofile that represents them all, but that when the number of clients with profiles is large, a randomized approach scales better than a greedy one. We believe that profiles, as described, are an enabling technology that could open a rich new area of research extending beyond cache management to network data management in general.
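To make the idea of greedy, interruptible prefetching concrete, here is a minimal sketch under strong simplifying assumptions; it is not the profile language or the algorithm evaluated in the paper. It assumes each candidate object has a fixed size and a single additive utility score derived from a profile. The names `Item`, `greedy_order`, and `prefetch` are hypothetical and exist only for illustration. Objects are loaded in decreasing order of utility per unit size, so if staging is interrupted, the prefix already loaded consists of the densest objects that fit.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Item:
    """Hypothetical cacheable object; the fields are illustrative assumptions."""
    name: str
    size: int       # space the object occupies in the cache (assumed > 0)
    utility: float  # value a profile assigns to the object (assumed additive)


def greedy_order(items: Iterable[Item]) -> List[Item]:
    """Sort items by utility per unit size, highest density first."""
    return sorted(items, key=lambda it: it.utility / it.size, reverse=True)


def prefetch(items: Iterable[Item], capacity: int) -> Iterator[Item]:
    """Yield items to load in greedy order, skipping any that no longer fit.

    Because this is a generator, a caller can stop consuming it at any
    point, which models staging that is interrupted without warning.
    """
    remaining = capacity
    for it in greedy_order(items):
        if it.size <= remaining:
            remaining -= it.size
            yield it


if __name__ == "__main__":
    candidates = [
        Item("a", size=4, utility=8.0),
        Item("b", size=3, utility=3.0),
        Item("c", size=2, utility=5.0),
        Item("d", size=5, utility=4.0),
    ]
    for chosen in prefetch(candidates, capacity=7):
        print(chosen.name, chosen.size, chosen.utility)
```

A richer profile language would generally make an object's value depend on what else is already cached, in which case the density would have to be recomputed after each choice rather than sorted once up front.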
