Mining fault-tolerant frequent patterns efficiently with powerful pruning

Frequent pattern mining in databases has been studied for several years. However, real-world data tends to be dirty, and conventional frequent pattern mining, which extracts only patterns that are matched exactly, is not sufficient. Fault-tolerant frequent pattern (FT-pattern) mining is better suited to extracting interesting information from real-world data that may be polluted by noise. We consider two problems: mining proportional FT-patterns and mining fixed FT-patterns. In proportional FT-pattern mining, the number of faults tolerated in a pattern is proportional to the length of the pattern. In fixed FT-pattern mining, the number of faults tolerated is the same for patterns of every length. A new graph structure, the FT-association graph, is proposed to filter out impossible candidates efficiently. Experimental results show that the proposed algorithms are highly efficient for mining both proportional and fixed FT-patterns.
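
The difference between the two tolerance schemes can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper and does not use the FT-association graph; it only counts, by brute force, how many transactions match a pattern when some items are allowed to be missing. The function names, the `ratio` and `delta` parameters, the floor rounding rule, and the toy transactions are all assumptions made for illustration.

```python
from math import floor


def max_faults_fixed(pattern, delta):
    """Fixed scheme (assumed form): the same fault budget for every pattern length."""
    return delta


def max_faults_proportional(pattern, ratio):
    """Proportional scheme (assumed form): the budget grows with pattern length.

    Rounding down with floor() is an assumption of this sketch.
    """
    return floor(ratio * len(pattern))


def ft_support(transactions, pattern, max_faults):
    """Count transactions that miss at most max_faults items of the pattern."""
    pattern = set(pattern)
    return sum(1 for t in transactions if len(pattern - set(t)) <= max_faults)


# Hypothetical toy database, not taken from the paper.
transactions = [
    {"a", "b", "c", "d"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
    {"b", "c", "d"},
]
pattern = {"a", "b", "c", "d"}

exact = ft_support(transactions, pattern, 0)
fixed = ft_support(transactions, pattern, max_faults_fixed(pattern, delta=1))
proportional = ft_support(
    transactions, pattern, max_faults_proportional(pattern, ratio=0.5)
)
print(exact, fixed, proportional)
```

With this toy data, exact matching finds the pattern in only one transaction, a fixed budget of one fault raises the count to three, and a proportional budget of half the pattern length (two faults for a four-item pattern) covers all five transactions.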
