WUML: A Web Usage Manipulation Language for Querying Web Log Data

In this paper, we develop the Web Usage Manipulation Language (WUML), a novel declarative language for manipulating Web log data. We assume that the trails users form while navigating can be identified from Web log files. These trails are modelled both as a transition graph and as a navigation matrix with respect to the underlying Web topology. A WUML expression is executed by translating it into the Navigation Log Algebra (NLA), which consists of the sum, union, difference, intersection, projection, selection, power and grouping operators. Because real navigation matrices are sparse, we perform a range of experiments to study how different matrix storage schemes affect the performance of the NLA.
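To make the navigation-matrix idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes a trail is an ordered list of page identifiers, that a matrix entry counts how often one page is followed directly by another, and that the NLA sum operator adds matrices entry by entry; the abstract does not define any of these details. The names `navigation_matrix` and `matrix_sum` are hypothetical, and the dictionary-of-keys layout is only one of many possible sparse storage schemes.

```python
from collections import Counter


def navigation_matrix(trails):
    """Build a sparse navigation matrix from user trails.

    Assumption: entry (src, dst) counts the direct transitions from page
    src to page dst. Only non-zero entries are stored (dictionary of keys).
    """
    counts = Counter()
    for trail in trails:
        # Pair each page with its successor in the same trail.
        for src, dst in zip(trail, trail[1:]):
            counts[(src, dst)] += 1
    return counts


def matrix_sum(a, b):
    """Entry-wise sum of two sparse matrices (an assumed reading of NLA sum)."""
    return a + b  # Counter addition keeps only positive entries


# Two hypothetical trails over pages A, B, C and D.
monday = navigation_matrix([["A", "B", "C"], ["A", "B", "D"]])
tuesday = navigation_matrix([["B", "C"]])
combined = matrix_sum(monday, tuesday)
```

Storing only the observed transitions keeps memory proportional to the number of distinct links followed rather than to the square of the number of pages, which is the property that makes the choice of storage scheme matter for sparse matrices.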
