Fast Mining Frequent Page Sets from Web Log by Filtering Strategy

Mining frequent web page sets from web logs can help in understanding user behavior. In this paper, weights are assigned dynamically according to the dwelling time of users on different pages in order to mine weighted frequent page sets (WFPS), and a filtering strategy is introduced to eliminate page sets that cannot be weighted frequent. An improved algorithm, F&T, uses this filtering strategy to reduce the number of candidate page sets and mine WFPS quickly. It consists of three steps: preprocessing, filtering, and testing. For the filtering step of F&T, a filtering algorithm based on an improved Apriori (FIA) is presented, which reduces the number of database scans and so speeds up filtering. A second filtering algorithm based on a WPS-tree (FWPS) is then given, which accelerates the computation of weights and speeds up filtering further. The experimental results show that the two F&T algorithms (F&T_FIA and F&T_FWPS) are much faster than existing weighted web log mining algorithms. F&T_FWPS is faster than F&T_FIA, but F&T_FIA uses less memory than F&T_FWPS.
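The filter-then-test idea can be illustrated with a minimal Python sketch. It is not the paper's F&T, FIA, or FWPS algorithm, since the abstract does not give their definitions. The sketch assumes a hypothetical weighting in which a page set's weight in a session is the share of that session's total dwelling time spent on its pages, and it uses plain (unweighted) support as the filter: under this assumption the weight in any session is at most 1, so weighted support can never exceed plain support, and a page set that fails the plain-support threshold cannot be weighted frequent. All function names and the sample data are illustrative.

```python
from itertools import combinations

def plain_support(page_set, sessions):
    """Fraction of sessions that contain every page in page_set."""
    hits = sum(1 for s in sessions if page_set.issubset(s))
    return hits / len(sessions)

def weighted_support(page_set, sessions):
    """Assumed definition: average share of session dwelling time spent on
    page_set, counting only sessions that contain all of its pages.
    Dwelling times are assumed to be positive."""
    total = 0.0
    for s in sessions:
        if page_set.issubset(s):
            total += sum(s[p] for p in page_set) / sum(s.values())
    return total / len(sessions)

def filter_and_test(sessions, min_wsup, max_size=3):
    """Level-wise search: filter candidates by plain support (an upper bound
    on the assumed weighted support), then test the exact weighted support."""
    pages = sorted({p for s in sessions for p in s})
    survivors = [frozenset([p]) for p in pages
                 if plain_support(frozenset([p]), sessions) >= min_wsup]
    result = {}
    size = 1
    while survivors and size <= max_size:
        # Testing step: exact weighted support for the surviving candidates.
        for cand in survivors:
            w = weighted_support(cand, sessions)
            if w >= min_wsup:
                result[cand] = w
        # Filtering step: join survivors and keep only those whose upper
        # bound (plain support) still reaches the threshold.
        size += 1
        joined = {a | b for a, b in combinations(survivors, 2)
                  if len(a | b) == size}
        survivors = [c for c in joined
                     if plain_support(c, sessions) >= min_wsup]
    return result

# Hypothetical sessions: page -> dwelling time in seconds.
sessions = [
    {"A": 30, "B": 10},
    {"A": 20, "B": 20, "C": 40},
    {"C": 10},
    {"A": 50},
]
wfps = filter_and_test(sessions, min_wsup=0.4)
```

In this sample, the page sets {A, C} and {B, C} are discarded by the filter without their weights ever being computed. The join is made over filter survivors rather than over tested page sets because, under the assumed weighting, weighted support is not anti-monotone: {A, B} has a higher weighted support than {B} alone.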
