Paper Information - CS-Mine: An Efficient WAP-Tree Mining for Web Access Patterns

Much research has been done on discovering interesting and frequent user access patterns from web logs. Recently, a novel data structure known as the Web Access Pattern Tree (WAP-tree) was developed. The associated WAP-mine algorithm is clearly faster than traditional sequential pattern mining techniques. However, WAP-mine must reconstruct large numbers of intermediate conditional WAP-trees during mining, which is very costly. In this paper, we propose an efficient WAP-tree mining algorithm, CS-mine (Conditional Sequence mining algorithm), which works directly on the initial conditional sequence base of each frequent event and eliminates the need to reconstruct intermediate conditional WAP-trees. This significantly improves efficiency compared with WAP-mine, especially as the support threshold becomes smaller and the database becomes larger.
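The idea of mining from conditional sequence bases rather than rebuilt trees can be illustrated with a minimal sketch. This is not the CS-mine implementation from the paper: it skips the WAP-tree entirely and works on plain tuples, and it assumes that the conditional sequence base of an event is the set of prefixes that precede the event's last occurrence in each access sequence. The names `conditional_base`, `mine`, and `logs`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def conditional_base(sequences, event):
    """Return the prefixes preceding the last occurrence of `event`.

    Assumption for this sketch: this set of prefixes plays the role of
    the conditional sequence base of `event`.
    """
    base = []
    for seq in sequences:
        if event in seq:
            # Index of the last occurrence of `event` in `seq`.
            last = len(seq) - 1 - seq[::-1].index(event)
            base.append(seq[:last])
    return base


def mine(sequences, min_support, suffix=()):
    """Recursively grow frequent patterns that end with `suffix`.

    Support is counted once per sequence. Each recursive call works on a
    conditional base (a list of shorter sequences) and never rebuilds a tree.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(set(seq))  # count each event at most once per sequence

    patterns = {}
    for event, count in counts.items():
        if count < min_support:
            continue
        pattern = (event,) + suffix
        patterns[pattern] = count
        # Continue mining on the conditional base of the extended pattern.
        patterns.update(
            mine(conditional_base(sequences, event), min_support, pattern)
        )
    return patterns


# Toy access sequences (made up for illustration; one tuple per user session).
logs = [
    ("a", "b", "c"),
    ("a", "c"),
    ("b", "a", "c"),
    ("a", "b"),
]
patterns = mine(logs, min_support=3)
```

With this toy data and a support threshold of 3, `patterns` holds the single events `a`, `b`, and `c` plus the access pattern `a` followed by `c`. Lowering the threshold enlarges every conditional base that must be processed, which is the regime where the abstract reports the largest gain over WAP-mine.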