Policies for Caching OLAP Queries in Internet Proxies

The Internet now offers users more than simple information. Decision makers can issue analytical, as opposed to transactional, queries that involve massive data (such as aggregations of millions of rows in a relational database) in order to identify useful trends and patterns. Such queries are often referred to as online analytical processing (OLAP) queries. Typically, pages carrying query results do not exhibit temporal locality and are therefore not considered for caching at Internet proxies. For OLAP this is a major problem, as the cost of these queries is significantly higher than that of transactional queries. This paper proposes a technique to reduce the response time of OLAP queries that originate from geographically distributed private LANs and are issued through the Web toward the central data warehouse (DW) of an enterprise. An active caching scheme is introduced that enables the LAN proxies to cache parts of the data, together with the semantics of the DW, in order to process queries and construct the resulting pages. OLAP queries arriving at the proxy are satisfied either locally or from the DW, depending on the relative access costs. We formulate a cost model that characterizes the respective latencies, taking into consideration the combined effects of both common Web access and query processing. We propose a cache admittance and replacement algorithm that operates on a hybrid Web-OLAP input and outperforms both pure-Web and pure-OLAP caching schemes.
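The decision between answering a query at the proxy and forwarding it to the DW can be illustrated with a minimal sketch. This is not the paper's cost model: the `CostEstimate` fields, the `choose_source` function, and the simple additive latency formula are all hypothetical and serve only to show what comparing relative access costs might look like.

```python
from dataclasses import dataclass


@dataclass
class CostEstimate:
    """Hypothetical latency estimates (in seconds) for one OLAP query."""
    local_processing_s: float  # time to compute the answer from data cached at the proxy
    network_rtt_s: float       # round-trip time between the proxy and the DW
    dw_processing_s: float     # time for the DW to process the query
    transfer_s: float          # time to ship the result page back to the proxy


def choose_source(est: CostEstimate, answerable_locally: bool) -> str:
    """Return "proxy" or "dw" depending on which path is estimated to be cheaper.

    Assumption: the remote cost is a plain sum of network and processing terms.
    """
    if not answerable_locally:
        # The cached data cannot answer this query, so the DW is the only option.
        return "dw"
    remote_cost = est.network_rtt_s + est.dw_processing_s + est.transfer_s
    return "proxy" if est.local_processing_s <= remote_cost else "dw"


if __name__ == "__main__":
    cheap_local = CostEstimate(0.5, 0.2, 3.0, 0.4)
    print(choose_source(cheap_local, answerable_locally=True))
```

In this sketch a query is served locally only when the cached data can answer it and the estimated local processing time does not exceed the estimated cost of going to the DW.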
