Intelligent Web caching using Adaptive Regression Trees, Splines, Random Forests and Tree Net

Web caching is a technology for reducing network traffic on the Internet. It temporarily stores Web objects (such as HTML documents) for later retrieval. Web caching has three significant advantages: reduced bandwidth consumption, reduced server load, and reduced latency. These benefits have made the Web less expensive and better performing. The aim of this research is to introduce advanced machine learning approaches for Web caching that decide whether or not to store an object on the cache server, a decision that can be modelled as a classification problem. The challenges include ranking the attributes and achieving significant improvements in classification accuracy. Four methods are employed for classification in Web caching: Classification and Regression Trees (CART), Multivariate Adaptive Regression Splines (MARS), Random Forest (RF) and TreeNet (TN). The experimental results reveal that CART performed extremely well in classifying Web objects from the existing log data, making it an excellent option to consider for enhancing Web cache performance.
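To make the "cache or not" classification concrete, the following is a minimal sketch of a CART-style tree that splits on Gini impurity. It is not the implementation evaluated in this research: the two features (object size in KB, hits in the last hour), the eight training rows, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature, threshold, weighted_gini) for the best binary split, or None."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

def build_tree(rows, labels, max_depth=3):
    """Grow a tree: leaves are class labels, inner nodes are (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or gini(labels) == 0.0:
        return majority
    split = best_split(rows, labels)
    if split is None:  # all rows identical, nothing to split on
        return majority
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
            build_tree([r for r, _ in right], [y for _, y in right], max_depth - 1))

def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical log-derived rows: (size_kb, hits_last_hour); label 1 = cache, 0 = do not cache.
rows = [(12, 9), (8, 7), (300, 1), (450, 0), (20, 6), (15, 1), (500, 8), (250, 2)]
labels = [1, 1, 0, 0, 1, 0, 1, 0]

tree = build_tree(rows, labels)
print(tree)                      # root split chosen by the Gini criterion
print(predict(tree, (10, 5)))    # small, frequently requested object
print(predict(tree, (400, 0)))   # large, unrequested object
```

On this toy data the request frequency alone separates the two classes, which also hints at how a tree exposes an attribute ranking: the feature chosen at the root is the one that reduces impurity the most.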
