User Behaviour Pattern Mining from Weblog

In this paper, the authors build a tree from both frequent and non-frequent items in a single scan of the weblog, and name it the Revised PLWAP with Non-frequent Items tree (RePLNI-tree). While mining sequential patterns, the links related to non-frequent items are virtually discarded. Hence, it is not necessary to delete nodes or maintain their information when revising the tree to mine an updated weblog. Because the algorithm supports both incremental and interactive mining, the tree need not be reconstructed from scratch, nor the patterns recomputed, each time the weblog is updated or the minimum support changes. The proposed tree performs well even when the incremental database is more than 50% of the size of the existing one, which is not the case for a recently proposed algorithm. For evaluation, the authors used a benchmark weblog and found the performance of the proposed tree encouraging compared with some recently proposed approaches.
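The idea of keeping every item in the tree and skipping non-frequent ones only at mining time can be illustrated with a small sketch. This is not the authors' RePLNI-tree: it is a plain prefix tree without PLWAP position codes or header links, and the names `AccessTree`, `insert`, `frequent_items`, and `projected_sequences` are hypothetical, chosen only for this illustration. It assumes support is measured as the fraction of access sequences that contain an item.

```python
from collections import Counter


class Node:
    """One tree node: a page (item) plus how many sequences pass through it."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class AccessTree:
    """Hypothetical prefix tree that stores every item, frequent or not."""

    def __init__(self):
        self.root = Node()
        self.num_sequences = 0
        # Number of sequences that contain each item at least once.
        self.item_support = Counter()

    def insert(self, sequence):
        """Add one access sequence; also used for incremental updates."""
        self.num_sequences += 1
        self.item_support.update(set(sequence))
        node = self.root
        for item in sequence:
            node = node.children.setdefault(item, Node(item))
            node.count += 1

    def frequent_items(self, min_support):
        """Items whose support ratio is at least min_support (0..1)."""
        threshold = min_support * self.num_sequences
        return {i for i, c in self.item_support.items() if c >= threshold}

    def projected_sequences(self, min_support):
        """Read sequences back with non-frequent items skipped.

        Nothing is deleted from the tree: non-frequent nodes are only
        ignored during traversal, so a later call with a different
        min_support or after more inserts needs no rebuild.
        """
        frequent = self.frequent_items(min_support)
        result = Counter()

        def walk(node, prefix):
            for child in node.children.values():
                path = prefix + (child.item,) if child.item in frequent else prefix
                # Sequences that end exactly at this child node.
                ended = child.count - sum(g.count for g in child.children.values())
                if ended > 0 and path:
                    result[path] += ended
                walk(child, path)

        walk(self.root, ())
        return result


# Single scan over an initial (made-up) weblog.
tree = AccessTree()
for seq in [["a", "b", "c"], ["a", "x", "b"], ["a", "b"], ["a", "c"]]:
    tree.insert(seq)

# Interactive mining: change the minimum support without rebuilding.
print(tree.projected_sequences(0.5))
print(tree.projected_sequences(0.75))

# Incremental mining: append new sequences to the same tree.
for seq in [["x", "b"], ["x", "a"]]:
    tree.insert(seq)
print(sorted(tree.frequent_items(0.5)))
```

In this toy data, item `x` is non-frequent at first but becomes frequent after the update; because its nodes were never removed, no earlier sequence has to be rescanned. A full miner would go on to enumerate frequent subsequences from the tree, which this sketch does not attempt.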
