In recent years, mining frequent itemsets over uncertain data has attracted much attention in the data mining community. Unlike the corresponding problem on deterministic data, frequent itemsets over uncertain data have two different definitions: the expected support-based frequent itemset and the probabilistic frequent itemset. Most existing work focuses on only one of the definitions, and no comprehensive study has compared the two. Moreover, for lack of a uniform implementation platform, existing solutions for the same definition even generate inconsistent results. In this demo, we present UFIMT (Uncertain Frequent Itemset Mining Toolbox), which not only discovers frequent itemsets over uncertain data but also compares the performance of different algorithms and demonstrates the relationship between the two definitions. First, we present the important techniques and implementation details of the mining problem. Second, we show the system architecture of UFIMT. Third, we report an empirical analysis on both real and synthetic benchmark data sets, which are used to compare different algorithms and to show the close relationship between the two frequent itemset definitions. Finally, we discuss some existing challenges and new findings.
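To make the two definitions concrete, the following is a minimal sketch and not UFIMT's implementation. It assumes the formulation commonly used in the literature: each transaction maps items to existence probabilities, items and transactions are independent, an itemset is expected support-based frequent when its expected support reaches a minimum count, and it is probabilistic frequent when the probability that its support reaches that count is at least a threshold. The function names and the toy database are hypothetical.

```python
from math import prod


def containment_probs(db, itemset):
    """Probability that each transaction contains every item in `itemset`.

    Assumption: items occur independently within a transaction, and an
    item missing from a transaction has probability 0.
    """
    return [prod(t.get(item, 0.0) for item in itemset) for t in db]


def expected_support(db, itemset):
    """Expected support: the sum of per-transaction containment probabilities."""
    return sum(containment_probs(db, itemset))


def frequentness_probability(db, itemset, min_count):
    """P(support >= min_count), computed by dynamic programming.

    The support is a sum of independent Bernoulli variables, so its
    distribution is built up one transaction at a time.
    """
    dist = [1.0]  # dist[k] = P(support == k) over the transactions seen so far
    for p in containment_probs(db, itemset):
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)   # transaction does not contain the itemset
            nxt[k + 1] += q * p       # transaction contains the itemset
        dist = nxt
    return sum(dist[min_count:])


# Hypothetical toy database: one dict of item -> existence probability per transaction.
db = [{"a": 0.8, "b": 0.5}, {"a": 0.6}, {"a": 0.9, "b": 0.7}]

esup = expected_support(db, {"a"})
fprob = frequentness_probability(db, {"a"}, 2)
```

On this toy database, the expected support of {a} is about 2.3 and the probability that its support is at least 2 is about 0.876, so {a} would be frequent under the first definition for a minimum count of 2, and under the second for any probability threshold up to roughly 0.876.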