GROUPING WEB ACCESS SEQUENCES USING SEQUENCE ALIGNMENT METHOD

In web usage mining, grouping web access sequences can help determine the behavior or intent of a set of users. The central problem in grouping web sessions is how to measure the similarity between them, and traditional measurement methods have many shortcomings. Grouping web sessions by similarity means maximizing intra-group similarity while minimizing inter-group similarity; here this task is carried out with the sequence alignment method. This paper introduces a new method for grouping web sessions that considers both global and local alignment techniques for similarity measurement. Sessions are treated as chronologically ordered sequences of pages accessed, and session length also plays a role in measuring similarity.
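To make the idea of alignment-based session similarity concrete, the following is a minimal sketch rather than the paper's actual method. It scores two sessions (lists of page identifiers in access order) with a standard global alignment (Needleman-Wunsch) and a standard local alignment (Smith-Waterman), then blends the two scores after normalizing by session length. The scoring values (`match`, `mismatch`, `gap`), the `weight` parameter, the normalization, and the function names are all assumptions chosen for illustration; the abstract does not specify them.

```python
def global_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Needleman-Wunsch score: aligns the two sessions end to end."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per page.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman score: best-scoring pair of sub-sessions."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best


def session_similarity(a, b, weight=0.5, match=2, mismatch=-1, gap=-1):
    """Hypothetical blend of global and local scores, scaled to [0, 1].

    The global score is divided by the best score the longer session could
    reach, so a large length difference lowers similarity. The local score
    is divided by the best score the shorter session could reach.
    """
    if not a or not b:
        return 0.0
    g = max(global_alignment_score(a, b, match, mismatch, gap), 0)
    l = local_alignment_score(a, b, match, mismatch, gap)
    g_norm = g / (match * max(len(a), len(b)))
    l_norm = l / (match * min(len(a), len(b)))
    return weight * g_norm + (1 - weight) * l_norm


if __name__ == "__main__":
    s1 = ["home", "products", "cart"]
    s2 = ["home", "cart"]
    print(session_similarity(s1, s2))
```

A pairwise similarity of this kind could then be fed to any clustering routine that accepts a similarity or distance matrix; the choice of clustering algorithm is outside what the abstract states.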
