Mining Maximal Frequent Access Sequences Based on Improved WAP-tree

Capturing maximal access sequences from Web usage data is a practical way to analyze users' access patterns. The Web access pattern tree (WAP-tree) stores access sequences in a highly compressed form, and mining frequent access sequences with a WAP-tree requires only two scans of the transaction database. However, the algorithm repeatedly constructs conditional WAP-trees, which reduces its efficiency. To address this shortcoming and to support the mining of maximal access sequences, this paper improves the WAP-tree and introduces a restrained subtree structure that avoids building the large number of conditional WAP-trees required by the traditional algorithm. In addition, restrained subtrees inherit the nodes of the WAP-tree, which saves memory. Experimental results show the efficiency of the improved algorithm.
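
As an illustration of the two-scan property mentioned above, the following is a minimal sketch of how a conventional WAP-tree can be built. It does not show the improved WAP-tree or the restrained subtree structure proposed in this paper. The names WapNode, build_wap_tree, and header, as well as the sample sequences and the support threshold, are assumptions chosen for the example.

```python
from collections import defaultdict


class WapNode:
    """One node of the tree: an access event plus a shared-prefix count."""

    def __init__(self, event, parent=None):
        self.event = event
        self.count = 0
        self.parent = parent
        self.children = {}  # event -> WapNode


def build_wap_tree(sequences, min_support):
    """Build a WAP-tree with two passes over the access sequences."""
    # Scan 1: count the number of sequences in which each event occurs.
    support = defaultdict(int)
    for seq in sequences:
        for event in set(seq):
            support[event] += 1
    frequent = {e for e, c in support.items() if c >= min_support}

    # Scan 2: drop infrequent events and insert each filtered sequence
    # into a prefix tree, so that shared prefixes share nodes.
    root = WapNode(None)
    header = defaultdict(list)  # event -> all nodes labelled with it
    for seq in sequences:
        node = root
        for event in seq:
            if event not in frequent:
                continue
            child = node.children.get(event)
            if child is None:
                child = WapNode(event, node)
                node.children[event] = child
                header[event].append(child)
            child.count += 1
            node = child
    return root, header


# Hypothetical access sequences; each character is one page access.
sequences = ["abdac", "eaebcac", "babfaec", "afbacfc"]
root, header = build_wap_tree(sequences, min_support=3)
```

With a support threshold of 3, only the events a, b, and c survive the first scan, so the second scan inserts the filtered sequences abac, abcac, babac, and abacc. The header lists link all nodes that carry the same event, which is the entry point a mining step would use.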