Incremental Relational Fuzzy Subtractive Clustering for Dynamic Web Usage Profiling

A primary application of web usage profiling is model-based collaborative filtering (CF), which is used to build recommender systems for web personalization. CF techniques require the accumulation of vast amounts of historical user-preference information, which is queried to provide a personalized experience. Model-based CF techniques are preferred over the somewhat more accurate memory-based CF techniques primarily because of their higher efficiency and scalability. When user interests change dynamically over time, memory-based CF allows new usage data to be added easily, but model-based CF often requires a complete remodeling operation. Because remodeling is computationally intensive, it is performed only occasionally, so the models in use normally lag behind current usage patterns, leading to irrelevant and mistargeted recommendations. Although maintenance schemes that adapt these models to non-stationary environments are very important, they have received relatively little attention so far. Our first contribution in this paper is a web usage profile maintenance scheme based on a new algorithm called incremental Relational Fuzzy Subtractive Clustering (RFSC). Incremental RFSC can efficiently add new usage data to an existing model, avoiding the expense of frequent remodeling. We validate the approach by showing close similarity between the clustering models obtained with incremental RFSC and those obtained by complete reclustering. Any maintenance scheme based on incremental updates of the profile requires a measure that indicates to the web analyst when the accumulated usage data must be reclustered; otherwise, continued maintenance leads to an irrelevant, obsolete model. The second contribution of this paper is thus the introduction of a quantitative measure called the impact factor. When the impact factor exceeds a predefined threshold, remodeling is recommended.
We conducted extensive experiments that compare the effectiveness of recommendations produced by our incremental RFSC technique, complete reclustering, and memory-based CF techniques. The results indicate that our maintenance technique is almost as good as complete remodeling.
