RWFIM: Recent weighted-frequent itemsets mining

Abstract In recent years, weighted frequent itemsets mining (WFIM) has become an important topic in data mining because, compared with traditional frequent itemsets mining, it can discover more useful and interesting patterns in real-world applications. Many algorithms have been developed to find weighted frequent itemsets (WFIs), but they do not take time sensitivity into account, and the out-of-date information they discover may be meaningless or useless for decision making. In this paper, a novel framework, namely recent weighted-frequent itemsets mining (RWFIM), is proposed to account for both weight and time-sensitive constraints. A projection-based algorithm, RWFIM-P, is first proposed for mining the designed recent weighted-frequent itemsets (RWFIs). It uses a projection-and-test mechanism to discover RWFIs recursively: the entire database is projected and divided into several sub-databases according to the currently processed itemset, which reduces computational cost and memory requirements. A second algorithm, RWFIM-PE, improves on RWFIM-P with the developed Estimated Weight of 2-itemset Pruning (EW2P) strategy, which mines the RWFIs without generating unpromising candidates and thus avoids the projection computations that RWFIM-P would perform for them. Experiments are conducted to evaluate the two proposed algorithms against traditional WFIM in terms of execution time, number of generated RWFIs, and scalability, under varying values of the two minimum thresholds, on several real-world and synthetic datasets.
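To make the two constraints concrete, the following is a minimal brute-force sketch of what "recent and weighted-frequent" could mean. It is not RWFIM-P or RWFIM-PE: it uses no projection and no EW2P pruning, and simply tests every itemset. The abstract does not give the formal definitions, so everything here is an assumption for illustration: recency is taken as a sum of exponentially decayed contributions `(1 - decay) ** (now - timestamp)` over the transactions containing the itemset, weighted support is taken as the average item weight times relative support, and all names (`recent_weighted_frequent_itemsets`, `decay`, `min_recency`, `min_wsup`) are hypothetical.

```python
from itertools import combinations


def recent_weighted_frequent_itemsets(transactions, weights, now, decay,
                                      min_recency, min_wsup):
    """Brute-force check of both thresholds for every itemset.

    transactions: list of (timestamp, set_of_items)
    weights:      dict mapping item -> weight (assumed to lie in (0, 1])
    """
    items = sorted({item for _, t in transactions for item in t})
    total = len(transactions)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            wanted = set(itemset)
            # Timestamps of the transactions that contain the whole itemset.
            covering = [ts for ts, t in transactions if wanted <= t]
            if not covering:
                continue
            # Assumed recency: older transactions contribute less.
            recency = sum((1 - decay) ** (now - ts) for ts in covering)
            # Assumed weighted support: mean item weight x relative support.
            avg_weight = sum(weights[i] for i in itemset) / k
            wsup = avg_weight * len(covering) / total
            if recency >= min_recency and wsup >= min_wsup:
                result[itemset] = (recency, wsup)
    return result


# Toy data (made up for this sketch).
transactions = [(1, {"a", "b"}), (2, {"a"}), (3, {"a", "b", "c"})]
weights = {"a": 1.0, "b": 0.5, "c": 0.2}
found = recent_weighted_frequent_itemsets(
    transactions, weights, now=3, decay=0.5, min_recency=1.0, min_wsup=0.4)
print(sorted(found))  # [('a',), ('a', 'b')]
```

On this toy data, `('a', 'b')` passes while `('b',)` alone does not. Under the averaged-weight assumption, weighted support is therefore not downward closed, which is one reason an estimated upper bound on weight is useful for pruning instead of testing candidates directly.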
