PUF-Tree: A Compact Tree Structure for Frequent Pattern Mining of Uncertain Data

Many existing algorithms mine frequent patterns from traditional databases of precise data. In some situations, however, the data are uncertain, and in recent years researchers have turned their attention to frequent pattern mining from uncertain data. Well-known algorithms for this task include UF-growth and UFP-growth, which use the UF-tree and the UFP-tree, respectively. However, these trees can be large, which degrades mining performance. In this paper, we propose (i) a more compact tree structure to capture uncertain data and (ii) an algorithm for mining all frequent patterns from the tree. Experimental results show that (i) our tree is usually more compact than the UF-tree or the UFP-tree, (ii) our tree can be as compact as the FP-tree, and (iii) our mining algorithm finds frequent patterns efficiently.
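
To make "frequent patterns from uncertain data" concrete, the sketch below assumes the expected-support model commonly used in this line of work: each item in a transaction carries an existential probability, items are treated as independent, and the expected support of a pattern is the sum over transactions of the product of its items' probabilities. The abstract does not spell out this model, so the function name `expected_support`, the dictionary layout, and the toy data are illustrative assumptions. The code is not the proposed tree or mining algorithm.

```python
from math import prod

def expected_support(itemset, transactions):
    """Expected support of `itemset` under the assumed independence model.

    Each transaction is a dict mapping an item to its existential
    probability. A transaction contributes the product of the
    probabilities of the items in `itemset`, or nothing if any item
    is absent from it.
    """
    total = 0.0
    for t in transactions:
        if all(item in t for item in itemset):
            total += prod(t[item] for item in itemset)
    return total

# Hypothetical toy database of three uncertain transactions.
transactions = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.4},
    {"a": 0.6, "b": 0.5, "c": 0.2},
]

# Under this model, a pattern is frequent when its expected
# support reaches a user-chosen threshold (here, 1.0).
minsup = 1.0
is_frequent = expected_support(["a", "b"], transactions) >= minsup
```

Here the pattern {a, b} occurs in the first and third transactions, contributing 0.9 × 0.8 and 0.6 × 0.5 respectively, so it clears a threshold of 1.0, whereas {a, c} does not.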
