Expectation maximization clustering algorithm for user modeling in web usage mining system

Intelligent personalized online services, such as web recommender systems, usually need a model of users' web access behavior. One promising approach is web usage mining, which mines web logs for user models and recommendations, and whose algorithms have been widely used to model user web navigation behavior. In this study we propose a model for mining users' navigation patterns. The model is based on the expectation-maximization (EM) algorithm, which finds maximum likelihood estimates of parameters in probabilistic models that depend on unobserved latent variables. The experimental results show that as the number of clusters decreases, the log likelihood converges toward lower values, and that the probability of the largest cluster decreases as the number of clusters increases in each treatment. The results also indicate that the clustering produced by the EM algorithm improves the visit-coherence (accuracy) of navigation pattern mining.
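To make the EM step concrete, the following is a minimal sketch and not the implementation used in the study. It assumes each user session is encoded as a binary vector (1 if a page was visited, 0 otherwise) and fits a mixture of independent Bernoulli distributions; the session encoding, the function name `em_bernoulli_mixture`, the smoothing constants, and the toy data are all illustrative assumptions.

```python
import math
import random


def em_bernoulli_mixture(sessions, k, iterations=50, seed=0):
    """Fit a k-cluster Bernoulli mixture to binary session vectors with EM.

    Hypothetical helper: the session encoding and the mixture family are
    assumptions made for illustration, not details taken from the study.
    """
    rng = random.Random(seed)
    n, d = len(sessions), len(sessions[0])
    weights = [1.0 / k] * k  # cluster probabilities (mixing weights)
    # Random initial page-visit probabilities, kept away from 0 and 1.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]
    history = []  # log likelihood recorded at each iteration

    for _ in range(iterations):
        # E-step: posterior probability of each cluster for each session.
        resp = []
        log_likelihood = 0.0
        for x in sessions:
            joint = []
            for c in range(k):
                p = weights[c]
                for j in range(d):
                    p *= probs[c][j] if x[j] else 1.0 - probs[c][j]
                joint.append(p)
            total = sum(joint)
            log_likelihood += math.log(total)
            resp.append([p / total for p in joint])
        history.append(log_likelihood)

        # M-step: re-estimate weights and page probabilities from the
        # responsibilities. The small constants are an assumed smoothing
        # term that keeps every probability strictly inside (0, 1).
        for c in range(k):
            mass = sum(r[c] for r in resp)
            weights[c] = mass / n
            for j in range(d):
                visits = sum(r[c] * x[j] for r, x in zip(resp, sessions))
                probs[c][j] = (visits + 1e-3) / (mass + 2e-3)

    return weights, probs, history


# Toy data (invented): four sessions on pages 0-2, four on pages 3-5.
sessions = [
    [1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0], [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0], [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 1, 1],
]
weights, probs, history = em_bernoulli_mixture(sessions, k=2)
print("cluster probabilities:", weights)
print("log likelihood, first and last iteration:", history[0], history[-1])
```

In this sketch the largest entry of `weights` plays the role of the "probability of the largest cluster", and `history` is the log likelihood trace; rerunning with different values of `k` is one way to explore how both quantities change with the number of clusters.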
