A Learning-Based Approach for Web Cache Management

Web caching is widely used to alleviate Internet traffic congestion in World Wide Web (WWW) services. To improve download throughput, an effective web cache management strategy must exploit web usage information to decide which stored documents to evict when the cache is full. This paper presents the Learning-Based Replacement algorithm (LBR), a hybrid replacement model for web caching that incorporates a machine learning technique (naive Bayes) into the LRU replacement method. The classifier is trained on the access history in a web log to better predict the probability that a cached page will be revisited by a subsequent request. The learned knowledge indicates which URL objects in the cache should be kept and which should be evicted; in this way the model captures hidden aspects of the user request pattern that determine the likelihood of re-reference. In a number of experiments, LBR shows potential improvement in revisit-probability prediction, hit rate, and byte hit rate over the traditional methods LRU, LFU, and GDSF.
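The abstract does not give the details of LBR, so the following is only a minimal sketch of the general idea, under stated assumptions: a categorical naive Bayes model with Laplace smoothing estimates the revisit probability of each cached object, and on saturation the cache scans from the least recently used end and evicts the first object whose estimate falls below a threshold, falling back to plain LRU otherwise. The class names, the feature tuple, and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
from collections import Counter, OrderedDict, defaultdict
import math


class NaiveBayes:
    """Categorical naive Bayes (binary label) with Laplace smoothing."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # (label, index) -> value counts
        self.values = defaultdict(set)              # index -> values seen in training

    def fit(self, rows, labels):
        for features, label in zip(rows, labels):
            self.label_counts[label] += 1
            for i, value in enumerate(features):
                self.feature_counts[(label, i)][value] += 1
                self.values[i].add(value)

    def prob_revisit(self, features):
        """Estimated P(revisited = True | features)."""
        total = sum(self.label_counts.values())
        log_scores = {}
        for label in (True, False):
            n = self.label_counts[label]
            score = math.log((n + 1) / (total + 2))
            for i, value in enumerate(features):
                count = self.feature_counts[(label, i)][value]
                # The extra +1 in the denominator reserves mass for unseen values.
                score += math.log((count + 1) / (n + len(self.values[i]) + 1))
            log_scores[label] = score
        peak = max(log_scores.values())
        weights = {k: math.exp(v - peak) for k, v in log_scores.items()}
        return weights[True] / (weights[True] + weights[False])


class LearnedLRUCache:
    """LRU cache whose eviction is guided by a revisit predictor (assumed design)."""

    def __init__(self, capacity_bytes, model, threshold=0.5):
        self.capacity = capacity_bytes
        self.model = model
        self.threshold = threshold      # hypothetical cut-off, not from the paper
        self.items = OrderedDict()      # url -> (size, features), oldest first
        self.used = 0
        self.hits = self.requests = 0
        self.hit_bytes = self.request_bytes = 0

    def request(self, url, size, features):
        """Serve one request; return True on a cache hit."""
        self.requests += 1
        self.request_bytes += size
        if url in self.items:
            self.items.move_to_end(url)  # mark as most recently used
            self.hits += 1
            self.hit_bytes += size
            return True
        if size > self.capacity:
            return False                 # object can never fit; do not cache it
        while self.used + size > self.capacity:
            self._evict()
        self.items[url] = (size, features)
        self.used += size
        return False

    def _evict(self):
        # Default victim is the least recently used object (plain LRU).
        victim = next(iter(self.items))
        # Prefer the oldest object that the model predicts will not be revisited.
        for url, (_, features) in self.items.items():
            if self.model.prob_revisit(features) < self.threshold:
                victim = url
                break
        size, _ = self.items.pop(victim)
        self.used -= size

    def hit_rate(self):
        return self.hits / self.requests if self.requests else 0.0

    def byte_hit_rate(self):
        return self.hit_bytes / self.request_bytes if self.request_bytes else 0.0
```

In this sketch the training rows would come from a preprocessed web log in which each request is labeled according to whether the same URL was requested again later; how the paper builds its features and labels is not stated in the abstract. Hit rate counts the fraction of requests served from the cache, while byte hit rate weights each request by object size, which is why the two metrics are tracked separately.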
