Algorithm of mining sequential patterns for web personalization services

This paper focuses on the requirements that web personalization services place on sequential patterns and sequential mining algorithms. Previous sequential mining algorithms treated all sequential patterns uniformly, but individual patterns in sequences often differ in importance. To address this, we propose a new algorithm that identifies weighted maximal frequent sequential patterns. First, the frequencies of frequent single items are used to calculate the weights of frequent sequences. Then, the frequent weighted sequence is defined in a way that not only leads to the discovery of important maximal sequences but also retains the anti-monotone property. Web usage mining has been used effectively to inform web personalization and recommender systems, and the new algorithm provides an effective method for optimizing these services. A variety of recommendation frameworks have been proposed previously, some based on non-sequential models such as association rules and others on sequential models. In this paper, we present a hybrid web personalization system based on clustering and contiguous sequential patterns. Our system clusters log files to determine the basic architecture of a website and then, for each cluster, applies contiguous sequential pattern mining to further optimize the site's topology. Finally, we propose two evaluation parameters to test the performance of our system.
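As a rough illustration of the ideas above, the following Python sketch mines contiguous sequential patterns from a handful of toy sessions and attaches a weight to each frequent pattern. It is not the paper's algorithm: the session data, the function names `contiguous_patterns` and `pattern_weight`, the `max_len` cap, and in particular the weighting rule (the mean support of a pattern's single items) are assumptions made for this example, since the abstract does not give the exact formula.

```python
from collections import Counter


def contiguous_patterns(sessions, min_support, max_len=3):
    """Return {pattern: support} for contiguous subsequences of page visits.

    Support is the fraction of sessions that contain the pattern at least once.
    """
    n = len(sessions)
    counts = Counter()
    for session in sessions:
        seen = set()  # count each pattern at most once per session
        for length in range(1, max_len + 1):
            for i in range(len(session) - length + 1):
                seen.add(tuple(session[i:i + length]))
        counts.update(seen)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def pattern_weight(pattern, supports):
    """Assumed weighting rule: mean support of the pattern's single items.

    Every item of a frequent pattern is itself frequent (anti-monotone
    property of support), so the single-item lookups always succeed.
    """
    return sum(supports[(item,)] for item in pattern) / len(pattern)


# Hypothetical click-stream sessions, one list of pages per visitor.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "shop"],
    ["news", "sports"],
]

frequent = contiguous_patterns(sessions, min_support=0.5)
weights = {p: pattern_weight(p, frequent) for p in frequent}

for pattern, support in sorted(frequent.items()):
    print(pattern, support, weights[pattern])
```

In this toy data, `("home", "news")` and `("news", "sports")` both have support 0.5, but the assumed weighting ranks the first higher because both of its pages are visited more often. A full system along the lines described above would first cluster the sessions and run the mining step per cluster.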
