Efficient Mining of Frequent Patterns from Uncertain Data

Since its introduction, mining of frequent patterns has been the subject of numerous studies. Generally, they focus on improving algorithmic efficiency for finding frequent patterns or on extending the notion of frequent patterns to other interesting patterns. Most of these studies find patterns from traditional transaction databases, in which the content of each transaction-namely, items-is definitely known and precise. However, there are many real-life situations in which ones are uncertain about the content of transactions. To deal with these situations, we propose a tree-based mining algorithm to efficiently find frequent patterns from uncertain data, where each item in the transactions is associated with an existential probability. Experimental results show the efficiency of our algorithm over its non-tree-based counterpart.

[1]  Roberto J. Bayardo,et al.  Efficiently mining long patterns from databases , 1998, SIGMOD '98.

[2]  Carson Kai-Sang Leung,et al.  DSTree: A Tree Structure for the Mining of Frequent Sets from Data Streams , 2006, Sixth International Conference on Data Mining (ICDM'06).

[3]  Osmar R. Zaïane,et al.  Incremental mining of frequent patterns without candidate generation or support constraint , 2003, Seventh International Database Engineering and Applications Symposium, 2003. Proceedings..

[4]  Laks V. S. Lakshmanan,et al.  Efficient dynamic mining of constrained frequent sets , 2003, TODS.

[5]  Rajeev Motwani,et al.  Beyond market baskets: generalizing association rules to correlations , 1997, SIGMOD '97.

[6]  Carson Kai-Sang Leung,et al.  CanTree: a tree structure for efficient incremental mining of frequent patterns , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[7]  Laks V. S. Lakshmanan,et al.  Exploratory mining and pruning optimizations of constrained associations rules , 1998, SIGMOD '98.

[8]  Ramakrishnan Srikant,et al.  Fast algorithms for mining association rules , 1998, VLDB 1998.

[9]  Philip S. Yu,et al.  Using a Hash-Based Method with Transaction Trimming for Mining Association Rules , 1997, IEEE Trans. Knowl. Data Eng..

[10]  Jian Pei,et al.  CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets , 2000, ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.

[11]  Shonali Krishnaswamy,et al.  Mining data streams: a review , 2005, SGMD.

[12]  Edward Hung,et al.  Mining Frequent Itemsets from Uncertain Data , 2007, PAKDD.

[13]  Heikki Mannila,et al.  OSSM: a segmentation approach to optimize frequency counting , 2002, Proceedings 18th International Conference on Data Engineering.

[14]  Laks V. S. Lakshmanan,et al.  Mining frequent itemsets with convertible constraints , 2001, Proceedings 17th International Conference on Data Engineering.

[15]  Laks V. S. Lakshmanan,et al.  Exploiting succinct constraints using FP-trees , 2002, SKDD.

[16]  Yufei Tao,et al.  Probabilistic Spatial Queries on Existentially Uncertain Data , 2005, SSTD.

[17]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD '00.

[18]  Chris Clifton,et al.  Query flocks: a generalization of association-rule mining , 1998, SIGMOD '98.

[19]  Francesco Bonchi,et al.  On closed constrained frequent pattern mining , 2004, Fourth IEEE International Conference on Data Mining (ICDM'04).

[20]  Raymond Chi-Wing Wong,et al.  Mining top-K frequent itemsets from data streams , 2006, Data Mining and Knowledge Discovery.

[21]  Philip S. Yu,et al.  Mining Frequent Patterns in Data Streams at Multiple Time Granularities , 2002 .

[22]  Daniel Kifer,et al.  DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints , 2002, Data Mining and Knowledge Discovery.

[23]  Sunita Sarawagi,et al.  Integrating association rule mining with relational database systems: alternatives and implications , 1998, SIGMOD '98.