Mining Access Patterns Efficiently from Web Logs

With the explosive growth of data available on the World Wide Web, the discovery and analysis of useful information from the Web has become a practical necessity. A Web access pattern, a sequence of accesses frequently pursued by users, is an interesting and useful kind of knowledge in practice. In this paper, we study the problem of mining access patterns from Web logs efficiently. A novel data structure, called the Web access pattern tree, or WAP-tree for short, is developed for efficient mining of access patterns from pieces of logs. The WAP-tree stores highly compressed, critical information for access pattern mining and facilitates the development of novel algorithms for mining access patterns in large sets of log pieces. Our algorithm finds access patterns from Web logs quite efficiently. The experimental and performance studies show that our method is, in general, an order of magnitude faster than conventional methods.
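The abstract does not spell out how the tree is built, so the following is only a minimal sketch of the general idea of compressing access sequences into a counted prefix tree. It assumes that events appearing in fewer than `min_support` sequences are dropped before insertion, and that a side table links together all nodes carrying the same label. The names `Node`, `build_prefix_tree`, `header`, and the toy sessions are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


class Node:
    """One tree node: an event label plus the number of sequences passing through it."""

    def __init__(self, label, parent=None):
        self.label = label
        self.count = 0
        self.parent = parent
        self.children = {}  # event label -> child Node


def build_prefix_tree(sequences, min_support):
    """Build a counted prefix tree over the frequent events of each sequence.

    Assumption: the support of an event is the number of sequences that
    contain it at least once.
    """
    support = Counter()
    for seq in sequences:
        support.update(set(seq))
    frequent = {event for event, n in support.items() if n >= min_support}

    root = Node(None)
    header = defaultdict(list)  # event label -> all nodes with that label
    for seq in sequences:
        node = root
        for event in seq:
            if event not in frequent:
                continue  # infrequent events are skipped, not stored
            child = node.children.get(event)
            if child is None:
                child = Node(event, parent=node)
                node.children[event] = child
                header[event].append(child)
            child.count += 1
            node = child
    return root, header


# Toy sessions, one character per page access (illustrative data only).
sessions = ["abdac", "eaebcac", "babfaec", "afbacfc"]
root, header = build_prefix_tree(sessions, min_support=3)
```

Sequences that share a prefix share nodes, which is where the compression comes from; the per-label table gives a miner direct access to every occurrence of an event without traversing the whole tree.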
