Maintaining discovered frequent itemsets: Cases for changeable database and support

Mining frequent itemsets from large databases has played an essential role in many data mining tasks. It is also important to maintain the discovered frequent itemsets for these data mining tasks when the database is updated. All algorithms proposed so far for the maintenance of discovered frequent itemsets are only performed with a fixed minimum support, which is the same as that used to obtain the discovered frequent itemsets. That is, users cannot change the minimum support even if the new results are unsatisfactory to the users. In this paper two new complementary algorithms, FMP (First Maintaining Process) and RMP (Repeated Maintaining Process), are proposed to maintain discovered frequent itemsets in the case that new transaction data are added to a transaction database. Both algorithms allow users to change the minimum support for the maintenance processes. FMP is used for the first maintaining process, and when the result derived from the FMP is unsatisfactory, RMP will be performed repeatedly until satisfactory results are obtained. The proposed algorithms re-use the previous results to cut down the cost of maintenance. Extensive experiments have been conducted to assess the performance of the algorithms. The experimental results show that the proposed algorithms are very resultful compared with the previous mining and maintenance algorithms for maintenance of discovered frequent itemsets.

[1]  Jiawei Han,et al.  Metarule-Guided Mining of Multi-Dimensional Association Rules Using Data Cubes , 1997, KDD.

[2]  Philip S. Yu,et al.  An effective hash-based algorithm for mining association rules , 1995, SIGMOD '95.

[3]  Ramakrishnan Srikant,et al.  Mining Association Rules with Item Constraints , 1997, KDD.

[4]  D. Cheung,et al.  Maintenance of Discovered Association Rules: When to update? , 1997, DMKD.

[5]  Charu C. Aggarwal,et al.  A Tree Projection Algorithm for Generation of Frequent Item Sets , 2001, J. Parallel Distributed Comput..

[6]  Xiaoping Du,et al.  Two Fast Algorithms for Repeated Mining of Association Rules Based on Ressource Reuse , 1999, ICEIS.

[7]  Jiming Liu,et al.  Towards Efficient Data Re-mining (DRM) , 2001, PAKDD.

[8]  David Wai-Lok Cheung,et al.  A General Incremental Technique for Maintaining Discovered Association Rules , 1997, DASFAA.

[9]  Jiawei Han,et al.  Efficient mining of partial periodic patterns in time series database , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[10]  Jiawei Han,et al.  Maintenance of discovered association rules in large databases: an incremental updating technique , 1996, Proceedings of the Twelfth International Conference on Data Engineering.

[11]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[12]  Qiming Chen,et al.  PrefixSpan,: mining sequential patterns efficiently by prefix-projected pattern growth , 2001, Proceedings 17th International Conference on Data Engineering.

[13]  Wynne Hsu,et al.  Integrating Classification and Association Rule Mining , 1998, KDD.

[14]  Dimitrios Gunopulos,et al.  Automatic subspace clustering of high dimensional data for data mining applications , 1998, SIGMOD '98.

[15]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD '00.

[16]  Jian Pei,et al.  CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets , 2000, ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.

[17]  Laks V. S. Lakshmanan,et al.  Efficient mining of constrained correlated sets , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[18]  Ramakrishnan Srikant,et al.  Fast algorithms for mining association rules , 1998, VLDB 1998.

[19]  Roberto J. Bayardo,et al.  Efficiently mining long patterns from databases , 1998, SIGMOD '98.

[20]  顕文 牧之内,et al.  Efficient Algorithms to Repeatedly Mine Large Itemsets with Different Minimum Supports , 2000 .

[21]  Ke Wang,et al.  Building Hierarchical Classifiers Using Class Proximity , 1999, VLDB.

[22]  Umeshwar Dayal,et al.  PrefixSpan: Mining Sequential Patterns by Prefix-Projected Growth , 2001, ICDE 2001.

[23]  David Wai-Lok Cheung,et al.  Is Sampling Useful in Data Mining? A Case in the Maintenance of Discovered Association Rules , 1998, Data Mining and Knowledge Discovery.

[24]  Ramakrishnan Srikant,et al.  Mining sequential patterns , 1995, Proceedings of the Eleventh International Conference on Data Engineering.

[25]  David Wai-Lok Cheung,et al.  Maintenance of Discovered Knowledge: A Case in Multi-Level Association Rules , 1996, KDD.