Linear time algorithms for finding maximal forward references

This paper presents two algorithms for finding maximal forward references in very large Web logs and compares their performance. A maximal forward reference is a longest sequence of Web pages visited by a user without revisiting any previously visited page in the sequence. Both algorithms are shown to have linear (hence optimal) time complexity.
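To make the definition concrete, the following is a minimal sketch of how maximal forward references can be extracted from a single user session. It follows the well-known forward/backward traversal idea and is not claimed to be either of the two algorithms in the paper. The function name `maximal_forward_references`, the representation of a session as a list of page identifiers, and the use of a hash map for position lookup are all assumptions made for illustration.

```python
def maximal_forward_references(session):
    """Return the maximal forward references of one session.

    `session` is assumed to be an ordered list of hashable page
    identifiers for a single user (a hypothetical input format).
    """
    path = []            # current forward path from the session start
    position = {}        # page -> index of that page in `path`
    results = []
    moving_forward = False

    for page in session:
        if page in position:
            # Backward reference: the user returned to a page on the path.
            # If the previous step was forward, the path is maximal.
            if moving_forward:
                results.append(list(path))
            # Truncate the path so that it ends at the revisited page.
            keep = position[page] + 1
            for dropped in path[keep:]:
                del position[dropped]
            del path[keep:]
            moving_forward = False
        else:
            # Forward reference: extend the current path.
            position[page] = len(path)
            path.append(page)
            moving_forward = True

    # A session that ends on a forward step leaves one more maximal path.
    if moving_forward:
        results.append(list(path))
    return results


if __name__ == "__main__":
    for mfr in maximal_forward_references(list("ABCDCBEGHGWAOUOV")):
        print("".join(mfr))
```

Each page is appended to and removed from the path at most once per visit, so the traversal itself takes expected time linear in the session length under the usual hash-map assumptions; copying each reported path adds time proportional to the size of the output.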
