Efficient sequential access pattern mining for web recommendations

Sequential access pattern mining discovers interesting and frequent user access patterns from web logs. Most previous studies have adopted Apriori-like sequential pattern mining techniques, which require multiple expensive scans of the database. More recent algorithms based on the Web Access Pattern tree (WAP-tree) can be an order of magnitude faster than traditional Apriori-like techniques. However, the conditional search strategies used in WAP-tree based mining algorithms require the re-construction of large numbers of intermediate conditional WAP-trees during the mining process, which is also very costly. In this paper, we propose an efficient sequential access pattern mining algorithm, known as CSB-mine (Conditional Sequence Base mining algorithm). CSB-mine operates directly on the conditional sequence bases of each frequent event, which eliminates the need to construct WAP-trees. This can improve the efficiency of the mining process significantly compared with WAP-tree based mining algorithms, especially as the support threshold becomes smaller and the database gets larger. We discuss the proposed CSB-mine algorithm and its performance, and we also discuss a sequential access-based web recommender system that incorporates the CSB-mine algorithm for web recommendations.
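
To make the idea of mining from per-event sequence bases rather than from trees more concrete, the following is a minimal sketch of a generic projection-based sequential pattern miner. It is not the CSB-mine algorithm itself, whose details are not given in this abstract. The function names `project` and `mine`, the choice of projecting on the suffix after the first occurrence of an event, and the use of an absolute support count are all assumptions made for illustration.

```python
from collections import Counter


def project(sequences, event):
    """Return the suffixes that follow the first occurrence of `event`.

    Assumption: this suffix collection stands in for a "conditional
    sequence base"; the paper's exact definition may differ.
    """
    projected = []
    for seq in sequences:
        if event in seq:
            suffix = seq[seq.index(event) + 1:]
            if suffix:
                projected.append(suffix)
    return projected


def mine(sequences, min_support, prefix=()):
    """Recursively grow frequent sequential patterns from projected bases.

    `min_support` is an absolute count of sequences (an assumption).
    Returns a dict mapping each frequent pattern (a tuple) to its support.
    """
    # Count each event at most once per sequence.
    counts = Counter()
    for seq in sequences:
        counts.update(set(seq))

    patterns = {}
    for event, count in sorted(counts.items()):
        if count < min_support:
            continue
        pattern = prefix + (event,)
        patterns[pattern] = count
        # Recurse on the sequence base conditioned on the extended pattern;
        # no tree structure is built at any step.
        patterns.update(mine(project(sequences, event), min_support, pattern))
    return patterns


# Hypothetical web access sessions, one tuple of page identifiers each.
sessions = [("a", "b", "c"), ("a", "c"), ("b", "c")]
result = mine(sessions, min_support=2)
```

In this sketch each recursive call only filters and slices the existing sequences, which is the kind of work a tree-free approach substitutes for rebuilding intermediate conditional trees.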
