Generating Dynamic Higher-Order Markov Models in Web Usage Mining

Markov models have been widely used for modelling users’ web navigation behaviour. In previous work we presented a dynamic clustering-based Markov model that accurately represents the second-order transition probabilities given by a collection of navigation sessions. Herein, we propose a generalisation of the method that takes into account higher-order conditional probabilities. The method uses the state cloning concept together with a clustering technique to separate the navigation paths that reveal differences in the conditional probabilities. We report on experiments conducted with three real-world data sets. The results show that some pages require a long history to understand the users’ choice of link, while others require only a short history. We also show that the number of additional states induced by the method can be controlled through a probability threshold parameter.
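To make the role of history concrete, the following is a minimal sketch, not the method proposed here: it estimates first-order and second-order transition probabilities from a toy collection of sessions and flags the pages whose outgoing probabilities change by more than a threshold once the previous page is known. The function names, the toy sessions, and the simple absolute-difference test are assumptions made for illustration only; the state cloning and clustering steps are not implemented.

```python
from collections import Counter, defaultdict


def _normalise(counts):
    """Turn nested counts into conditional probability tables."""
    return {
        ctx: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for ctx, ctr in counts.items()
    }


def first_order(sessions):
    """Estimate P(next | current) from consecutive page pairs."""
    counts = defaultdict(Counter)
    for s in sessions:
        for cur, nxt in zip(s, s[1:]):
            counts[cur][nxt] += 1
    return _normalise(counts)


def second_order(sessions):
    """Estimate P(next | previous, current) from consecutive page triples."""
    counts = defaultdict(Counter)
    for s in sessions:
        for prev, cur, nxt in zip(s, s[1:], s[2:]):
            counts[(prev, cur)][nxt] += 1
    return _normalise(counts)


def states_needing_history(sessions, threshold):
    """Pages whose link choice depends on the preceding page.

    A page is flagged when, for some preceding page, a second-order
    probability differs from the first-order one by more than `threshold`
    (a hypothetical criterion chosen for this illustration).
    """
    p1 = first_order(sessions)
    p2 = second_order(sessions)
    flagged = set()
    for (prev, cur), dist in p2.items():
        for nxt, p_first in p1[cur].items():
            if abs(dist.get(nxt, 0.0) - p_first) > threshold:
                flagged.add(cur)
    return flagged


# Toy sessions: users arriving at C from A go to X, those from B go to Y.
sessions = [["A", "C", "X"], ["A", "C", "X"], ["B", "C", "Y"], ["B", "C", "Y"]]
print(states_needing_history(sessions, threshold=0.1))  # prints {'C'}
```

In this toy data a first-order model gives page C a 0.5 probability for each link, whereas conditioning on the previous page makes the choice deterministic, so C is the kind of state that would be a candidate for cloning. Raising the threshold flags fewer pages, which mirrors how a probability threshold limits the number of additional states.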
