On mining webclick streams for path traversal patterns

Mining user access patterns from a continuous stream of Web-clicks presents new challenges over traditional Web usage mining in a large static Web-click database. Modeling user access patterns as maximal forward references, we present a single-pass algorithm StreamPath for online discovering frequent path traversal patterns from an extended prefix tree-based data structure which stores the compressed and essential information about user's moving histories in the stream. Theoretical analysis and performance evaluation show that the space requirement of StreamPath is limited to a logarithmic boundary, and the execution time, compared with previous multiple-pass algorithms [2], is fast.