Enhanced Integrated Approach to Predict Web User's Future Requests using K-Means and FP-Growth

The tremendous growth of the World Wide Web has increased the latency that users perceive when requesting resources from web servers, since millions of users connect to the same servers for different needs. To improve server performance, caching is used: frequently accessed pages are stored in proxy server caches. Prefetching of web pages is a more recent research area that, when combined with caching, can greatly improve performance. This paper proposes an improved algorithm for predicting the web pages a user will request next. Web users are first clustered according to their location using K-Means, and each cluster is then mined with the FP-Growth algorithm to find association rules, which are used to predict the pages to prefetch and store in the cache.
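
The two-stage idea described above (cluster users by location, then mine each cluster for rules that suggest pages to prefetch) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes users are represented by hypothetical (latitude, longitude) pairs with one session each, uses a plain Lloyd's K-Means seeded with the first k points, and substitutes simple pairwise co-occurrence counting for FP-Growth, which is enough to recover single-antecedent rules on toy data. All names, thresholds, and sample data are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def kmeans(points, k, iters=20):
    """Plain Lloyd's K-Means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return labels


def pair_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Mine rules 'page x -> page y' from page pairs (a stand-in for FP-Growth)."""
    n = len(sessions)
    item_count, pair_count = Counter(), Counter()
    for session in sessions:
        pages = sorted(set(session))
        item_count.update(pages)
        pair_count.update(combinations(pages, 2))
    rules = {}
    for (a, b), count in pair_count.items():
        if count / n < min_support:
            continue  # the pair is not frequent enough in this cluster
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]
            if confidence >= min_confidence:
                rules.setdefault(x, []).append((y, confidence))
    for x in rules:
        # Highest confidence first; ties are broken alphabetically.
        rules[x].sort(key=lambda t: (-t[1], t[0]))
    return rules


def predict(rules, current_page, top_n=2):
    """Return up to top_n candidate pages to prefetch after current_page."""
    return [page for page, _ in rules.get(current_page, [])[:top_n]]


# Hypothetical sample data: (location, pages visited in one session).
users = [
    ((12.9, 77.6), ["/home", "/news", "/sports"]),
    ((13.0, 77.5), ["/home", "/news"]),
    ((12.8, 77.7), ["/home", "/news", "/weather"]),
    ((40.7, -74.0), ["/home", "/shop", "/cart"]),
    ((40.8, -73.9), ["/home", "/shop"]),
    ((40.6, -74.1), ["/shop", "/cart"]),
]

labels = kmeans([loc for loc, _ in users], k=2)
rules_by_cluster = {
    c: pair_rules([s for (_, s), lab in zip(users, labels) if lab == c])
    for c in set(labels)
}
print(labels)
print(predict(rules_by_cluster[labels[0]], "/home"))
```

In a realistic setting the pair counting would be replaced by an actual FP-Growth implementation so that rules with longer antecedents can be mined without candidate generation, and the predicted pages would be handed to the proxy cache for prefetching.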
