Mining Significant Usage Patterns from Clickstream Data

Discovering usage patterns from Web data is one of the primary purposes of Web Usage Mining. This paper proposes a technique for generating Significant Usage Patterns (SUPs) and uses it to acquire significant “user preferred navigational trails”. The technique is a pipeline of four processing phases: sub-abstraction of sessionized Web clickstreams, clustering of the abstracted Web sessions, concept-based abstraction of the clustered sessions, and SUP generation. With this technique, Web site practitioners can extract valuable information about customer behavior. Experiments on Web log data provided by J.C. Penney demonstrate that the SUPs of different types of customers are distinguishable and interpretable. The technique is particularly suited to the analysis of dynamic websites.
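
The pipeline described above can be illustrated with a small Python sketch. It is not the paper's algorithm: the abstract does not specify the abstraction rules, the similarity measure, or the significance criterion. Everything here is an assumption chosen for illustration: URL-prefix abstraction stands in for both abstraction phases, Jaccard similarity with a greedy single pass stands in for the clustering step, and a simple support threshold on page-to-page transitions stands in for SUP generation. The function names, thresholds, and sample sessions are all hypothetical.

```python
from collections import Counter


def abstract_session(urls, depth=1):
    """Map each URL to its first `depth` path segments (an assumed
    abstraction rule) and collapse consecutive repeats."""
    out = []
    for url in urls:
        parts = [p for p in url.split("/") if p]
        label = "/".join(parts[:depth]) or "home"
        if not out or out[-1] != label:
            out.append(label)
    return out


def jaccard(a, b):
    """Set-based similarity between two abstracted sessions."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


def cluster_sessions(sessions, threshold=0.5):
    """Greedy single-pass clustering: a session joins the first cluster
    whose seed session is at least `threshold` similar, else starts a new one."""
    clusters = []
    for s in sessions:
        for c in clusters:
            if jaccard(s, c[0]) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def significant_transitions(cluster, min_support=0.5):
    """Return transitions that appear in at least `min_support`
    of the sessions in the cluster (an assumed significance test)."""
    counts = Counter()
    for s in cluster:
        counts.update(set(zip(s, s[1:])))
    n = len(cluster)
    return sorted(t for t, c in counts.items() if c / n >= min_support)


# Hypothetical sessionized clickstreams.
raw_sessions = [
    ["/home", "/men/shirts/123", "/men/shirts/456", "/cart"],
    ["/home", "/men/shoes/9", "/cart"],
    ["/home", "/help/returns", "/help/contact"],
]

abstracted = [abstract_session(s) for s in raw_sessions]
clusters = cluster_sessions(abstracted)
patterns = [significant_transitions(c) for c in clusters]
```

In this toy input the two shopping sessions fall into one cluster and the help-seeking session into another, so each cluster yields its own set of frequent transitions. That mirrors, in a very reduced form, the idea that patterns for different types of customers can be told apart.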
