Performance based Frequent Itemset Mining Techniques for Data Mining

Abstract: Data mining seeks interesting patterns in databases, such as association rules, correlations, sequences, episodes, classifiers, and clusters; of these, mining association rules is one of the most popular problems. There is a large body of research on frequent itemset mining (FIM), but most of it focuses on precise data, and very little addresses FIM in uncertain databases. However, there are situations in which the data are uncertain, which calls for mining of uncertain data. There are also situations in which users are interested only in frequent itemsets that satisfy user-specified aggregate constraints, which calls for constrained mining of uncertain data. Moreover, many other situations produce floods of uncertain data, which calls for stream mining of uncertain data. In this paper, we propose algorithms to deal with all these situations. We first design a tree-based mining algorithm to find all frequent itemsets from databases of uncertain data. We then extend it in two ways: to mine databases of uncertain data for only those frequent itemsets that satisfy user-specified aggregate constraints, and to mine streams of uncertain data for all frequent itemsets. Our experimental results show that the proposed algorithms are more effective than existing methods.
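
To make the setting concrete, the following is a minimal sketch of frequent itemset mining over uncertain data. It is not the tree-based algorithm proposed in this paper; it swaps in a plain level-wise (Apriori-style) search. It assumes the commonly used expected-support model, in which each item in a transaction carries an existence probability, items are treated as independent, and an itemset is frequent when its expected support reaches a threshold. The names `expected_support`, `frequent_itemsets`, `min_esup`, and `constraint`, as well as the sample data, are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import prod


def expected_support(itemset, transactions):
    """Expected support under the (assumed) item-independence model:
    sum over transactions of the product of the items' probabilities."""
    return sum(
        prod(t.get(item, 0.0) for item in itemset)
        for t in transactions
    )


def frequent_itemsets(transactions, min_esup, constraint=lambda itemset: True):
    """Level-wise search (not the paper's tree-based method).

    Returns {itemset: expected support} for itemsets whose expected
    support is at least min_esup and that satisfy `constraint`.
    """
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    result = {}
    while level:
        kept = []
        for cand in level:
            esup = expected_support(cand, transactions)
            if esup >= min_esup:
                # Expected support is anti-monotone, so only frequent
                # itemsets are extended to the next level.
                kept.append(cand)
                # The constraint only filters the output; it is not used
                # for pruning because it need not be anti-monotone.
                if constraint(cand):
                    result[cand] = esup
        size = len(level[0]) + 1
        next_level = {a | b for a, b in combinations(kept, 2) if len(a | b) == size}
        level = sorted(next_level, key=sorted)
    return result


# Hypothetical uncertain database: item -> existence probability.
transactions = [
    {"a": 0.9, "b": 0.8, "c": 0.4},
    {"a": 0.6, "b": 0.5},
    {"a": 0.7, "c": 0.9},
]

# Hypothetical aggregate constraint: total price of the itemset at most 5.
prices = {"a": 2, "b": 4, "c": 1}


def within_budget(itemset):
    return sum(prices[i] for i in itemset) <= 5


all_frequent = frequent_itemsets(transactions, min_esup=1.0)
constrained = frequent_itemsets(transactions, min_esup=1.0, constraint=within_budget)
```

With this sample data, {a, b} has expected support 0.9·0.8 + 0.6·0.5 = 1.02 and is frequent at a threshold of 1.0, but it is excluded from the constrained result because its total price is 6. A streaming variant would apply the same computation to successive batches or windows of transactions; that extension is not sketched here.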
