Making Web Servers Pushier

The success of the World Wide Web, measured by the number of its users and the resulting increase in traffic, is matched only by the patience required of a user sitting in front of a computer, waiting for a document to be downloaded. If one could identify the typical access patterns for a set of documents on a Web server, the server could use or extend the existing protocols to pre-fetch or push documents to browsers and proxy servers accordingly. In this paper, we present and evaluate a strategy for making Web servers "pushier". The documents to be pushed are determined by a set of association rules mined from a sample of the Web server's access log. Once a rule of the form "Document1 → Document2" has been identified and selected, the Web server pushes "Document2" whenever "Document1" is requested. The strategy is oriented toward the individual user without ignoring the aggregate perspective. We evaluate the effectiveness and the cost of such a strategy for two architectures: a two-tier "Web server / Web browser" architecture and a three-tier "Web server / proxy server / Web browser" architecture. We consider different settings in each architecture, as well as refinements of the strategy that take the size of the documents into account.
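As an illustration only, the rule-driven push decision described above can be sketched in a few lines of Python. This is not the paper's implementation: it assumes the access log has already been split into per-user sessions, it counts only pairwise rules rather than running a full association-rule miner, and the function names (`mine_rules`, `documents_to_push`), the thresholds, and the sample paths are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def mine_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Mine pairwise rules A -> B from sessions (hypothetical helper).

    support(A -> B)    = fraction of sessions containing both A and B
    confidence(A -> B) = sessions with A and B / sessions with A
    Returns {A: [(B, confidence), ...]}, best consequent first.
    """
    n = len(sessions)
    doc_count = Counter()
    pair_count = Counter()
    for session in sessions:
        docs = set(session)  # count each document once per session
        doc_count.update(docs)
        pair_count.update(permutations(sorted(docs), 2))

    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        confidence = count / doc_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules.setdefault(a, []).append((b, confidence))
    for consequents in rules.values():
        # Highest confidence first; ties broken by document name.
        consequents.sort(key=lambda item: (-item[1], item[0]))
    return rules


def documents_to_push(rules, requested, limit=1):
    """Return up to `limit` documents to push when `requested` is served."""
    return [doc for doc, _ in rules.get(requested, [])[:limit]]


# Made-up sessions standing in for a sample of the access log.
sessions = [
    ["/index.html", "/news.html"],
    ["/index.html", "/news.html", "/about.html"],
    ["/index.html", "/news.html"],
    ["/about.html"],
]
rules = mine_rules(sessions)
print(documents_to_push(rules, "/index.html"))  # ['/news.html']
print(documents_to_push(rules, "/about.html"))  # []
```

A size-aware refinement of the kind the abstract mentions could be approximated by filtering the candidates in `documents_to_push` against a byte limit, though how the paper actually weighs document size is not stated here.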
