Paper Information - Mining Frequent Itemsets from Uncertain Data

We study the problem of mining frequent itemsets from uncertain data under a probabilistic framework. We consider transactions whose items are associated with existential probabilities and give a formal definition of frequent patterns under such an uncertain data model. We show that traditional algorithms for mining frequent itemsets are either inapplicable or computationally inefficient under such a model. A data trimming framework is proposed to improve mining efficiency. Through extensive experiments, we show that the data trimming technique can achieve significant savings in both CPU cost and I/O cost.
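The abstract does not spell out its formal definition of a frequent pattern. As a rough illustration of what such a definition can look like, the sketch below assumes an expected-support model: item occurrences are treated as independent, the support of an itemset in one transaction is the product of its items' existential probabilities, and an itemset counts as frequent when its summed expected support reaches a minimum-support ratio times the number of transactions. The toy data and the names `transactions`, `expected_support`, and `is_frequent` are hypothetical and chosen only for this example.

```python
from math import prod

# Hypothetical toy database: each transaction maps an item to its
# existential probability (the chance the item is really present).
transactions = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 1.0},
    {"a": 1.0, "b": 0.5, "c": 0.5},
]


def expected_support(itemset, db):
    """Expected number of transactions containing every item in `itemset`.

    Assumes items occur independently, so the probability that a
    transaction contains the whole itemset is the product of the
    individual existential probabilities (0.0 for a missing item).
    """
    return sum(prod(t.get(item, 0.0) for item in itemset) for t in db)


def is_frequent(itemset, db, min_sup_ratio):
    """True if the expected support reaches min_sup_ratio * |db|."""
    return expected_support(itemset, db) >= min_sup_ratio * len(db)


if __name__ == "__main__":
    for candidate in [("a",), ("b",), ("a", "b")]:
        print(candidate, expected_support(candidate, transactions),
              is_frequent(candidate, transactions, 0.5))
```

Under this assumed model, items with very low existential probabilities contribute little to any itemset's expected support, which suggests why trimming such items before mining could reduce both CPU and I/O cost.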
