An online PPM prediction model for web prefetching

Web prefetching is a primary means of reducing user access latency. A substantial body of work in the open literature uses PPM (Prediction by Partial Match) to model and predict user request patterns. However, existing PPM models are generally constructed off-line. Because user request patterns may change over time, it is highly desirable to update the PPM model online and incrementally. We present an online PPM model that captures changing patterns while fitting within available memory. The model is implemented on a noncompact suffix tree and uses a sliding window to keep only the most recent W requests. To further improve prefetching performance, we use the maximum entropy principle to model the outgoing probability distributions of nodes. Our prediction model combines entropy, prediction accuracy rate, and the longest match rule. We evaluate performance using real web logs. Trace-driven simulation results show that our PPM prediction model provides significant improvements over previously proposed models.
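The sliding-window and longest-match ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it stores context counts in a plain dictionary rather than a noncompact suffix tree, omits the prediction accuracy rate, and uses the entropy of a context's outgoing distribution only as a simple confidence filter. The class name `SlidingWindowPPM` and the parameters `max_order`, `window`, and `max_entropy` are hypothetical names introduced for illustration.

```python
from collections import Counter, deque
from itertools import islice
from math import log2


class SlidingWindowPPM:
    """Toy PPM-style predictor over the most recent `window` requests."""

    def __init__(self, max_order=2, window=1000):
        self.max_order = max_order   # longest context length (assumed parameter)
        self.window = window         # W, the number of requests retained
        self.history = deque()
        self.counts = {}             # context tuple -> Counter of next requests

    def _recent(self):
        # Last `max_order` requests, oldest first.
        return list(islice(reversed(self.history), self.max_order))[::-1]

    def update(self, request):
        """Add one request online, then evict the oldest if the window is full."""
        recent = self._recent()
        for m in range(1, len(recent) + 1):
            ctx = tuple(recent[-m:])
            self.counts.setdefault(ctx, Counter())[request] += 1
        self.history.append(request)
        if len(self.history) > self.window:
            self._evict_oldest()

    def _evict_oldest(self):
        # Undo every (context, next) pair whose context starts at the oldest request.
        for m in range(1, min(self.max_order, len(self.history) - 1) + 1):
            ctx = tuple(islice(self.history, m))
            nxt = self.history[m]
            dist = self.counts[ctx]
            dist[nxt] -= 1
            if dist[nxt] <= 0:
                del dist[nxt]
            if not dist:
                del self.counts[ctx]
        self.history.popleft()

    @staticmethod
    def entropy(dist):
        """Shannon entropy (bits) of a Counter of next-request counts."""
        total = sum(dist.values())
        return -sum((c / total) * log2(c / total) for c in dist.values())

    def predict(self, max_entropy=1.0):
        """Longest-match rule: try the longest context first, skipping
        contexts whose outgoing distribution is too uncertain."""
        recent = self._recent()
        for m in range(len(recent), 0, -1):
            dist = self.counts.get(tuple(recent[-m:]))
            if dist and self.entropy(dist) <= max_entropy:
                return dist.most_common(1)[0][0]
        return None  # no sufficiently confident context; prefetch nothing


model = SlidingWindowPPM(max_order=2, window=6)
for page in ["A", "B", "C", "A", "B", "C"]:
    model.update(page)
print(model.predict())  # the context ("B", "C") has only been followed by "A"
```

Decrementing counts on eviction keeps memory bounded by the window size, which is the role the sliding window plays in the abstract; a suffix-tree implementation would achieve the same effect by removing the oldest suffix instead.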
