Clustering navigation patterns on a website using a Sequence Alignment Method

This paper presents a new method for clustering navigation patterns on a website. Instead of clustering users with a Euclidean distance measure, our approach partitions users into clusters using a Sequence Alignment Method. This ensures that the sequential relationships captured in the data are taken into account. The performance of the algorithm is compared with that of a method based on Euclidean distance measures. The proposed method is validated using user-traffic data from a Belgian telecom provider. Empirical results show that the method extracts sequences with similar behavioural patterns, with regard not only to content but also to the order in which pages are visited within a sequence.
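To illustrate why an alignment-based distance can separate sessions that a Euclidean measure treats as identical, the following is a minimal sketch rather than the paper's implementation. It assumes a plain edit-distance form of sequence alignment; the function names, the page labels, and the cost values (`indel=1.0`, `subst=2.0`) are hypothetical choices made for this example only.

```python
from collections import Counter
from math import sqrt


def alignment_distance(a, b, indel=1.0, subst=2.0):
    """Edit-distance style alignment cost between two page sequences.

    Assumption: unit insertion/deletion cost and a substitution cost
    equal to one deletion plus one insertion. These weights are
    illustrative and are not taken from the paper.
    """
    # prev[j] holds the cost of aligning the processed prefix of a with b[:j].
    prev = [j * indel for j in range(len(b) + 1)]
    for i, page_a in enumerate(a, start=1):
        curr = [i * indel]
        for j, page_b in enumerate(b, start=1):
            match_cost = 0.0 if page_a == page_b else subst
            curr.append(min(
                prev[j] + indel,           # delete page_a
                curr[j - 1] + indel,       # insert page_b
                prev[j - 1] + match_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def euclidean_distance(a, b):
    """Euclidean distance between page-visit count vectors (order is ignored)."""
    count_a, count_b = Counter(a), Counter(b)
    pages = set(count_a) | set(count_b)
    return sqrt(sum((count_a[p] - count_b[p]) ** 2 for p in pages))


# Two hypothetical sessions visiting the same pages in opposite order.
session_1 = ["home", "products", "pricing", "contact"]
session_2 = ["contact", "pricing", "products", "home"]

print(euclidean_distance(session_1, session_2))   # 0.0: same content
print(alignment_distance(session_1, session_2))   # 6.0: different order
```

Under these assumed costs, the two sessions are indistinguishable by visit counts but far apart under alignment, which is the kind of order sensitivity the abstract describes. A pairwise matrix of such distances could then be passed to any clustering procedure that accepts precomputed dissimilarities.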
