Probability-based Incremental Association Rules Discovery Algorithm with Hashing Technique

Association rule discovery, which extracts itemsets that occur together, is one of the most active areas of research in data mining. In a dynamic database, where new transactions are continually inserted, keeping discovered patterns up to date and finding new patterns are challenging problems of great practical importance. Insertions may introduce new association rules and invalidate some existing ones, so it is important to study efficient algorithms for incrementally updating association rules in large databases. In this paper, we modify an existing incremental algorithm, the probability-based incremental association rule discovery algorithm, which uses the principle of Bernoulli trials to find frequent and expected frequent k-itemsets. These sets are determined from a set of candidate k-itemsets, and generating and testing the candidates is a time-consuming step of the algorithm. To reduce the number of candidate 2-itemsets that must be checked through repeated scans of the database, we use a hash technique to generate the candidate 2-itemsets, in particular for the frequent and expected frequent 2-itemsets, and thereby improve the performance of the probability-based algorithm. The modified algorithm reduces both the number of scans of the original database and the number of candidate itemsets needed to generate the frequent and expected frequent 2-itemsets. As a result, its execution time is shorter than that of the previous methods. We also conduct simulation experiments to evaluate the proposed algorithm, and the results show that it performs well.
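The abstract does not spell out the hash technique, so the following is only a minimal sketch of a generic hash-bucket filter for candidate 2-itemsets, not the paper's actual procedure. The function names, the CRC32-based bucket function, the bucket count, and the single `min_count` threshold are all assumptions made for illustration; the expected frequent threshold derived from Bernoulli trials is not modelled here.

```python
import zlib
from collections import Counter
from itertools import combinations


def bucket_of(pair, n_buckets):
    """Map a 2-itemset to a bucket index (hypothetical, deterministic hash)."""
    key = f"{pair[0]}\x1f{pair[1]}".encode()
    return zlib.crc32(key) % n_buckets


def hashed_candidate_pairs(transactions, min_count, n_buckets=101):
    """One scan: count single items and hash every pair into a bucket.

    A pair is kept as a candidate only if both items are frequent and its
    bucket count reaches min_count. A bucket count is never smaller than the
    true count of any pair in it, so no frequent pair is discarded.
    """
    item_counts = Counter()
    bucket_counts = [0] * n_buckets
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair, n_buckets)] += 1

    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_count)
    return [
        pair
        for pair in combinations(frequent_items, 2)
        if bucket_counts[bucket_of(pair, n_buckets)] >= min_count
    ]


def count_pairs(transactions, candidates):
    """Second scan: exact support counts for the surviving candidates only."""
    wanted = set(candidates)
    counts = Counter()
    for transaction in transactions:
        for pair in combinations(sorted(set(transaction)), 2):
            if pair in wanted:
                counts[pair] += 1
    return counts


transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "d"],
    ["a", "b", "d"],
]
candidates = hashed_candidate_pairs(transactions, min_count=3)
supports = count_pairs(transactions, candidates)
```

In this toy data only items `a` and `b` reach the threshold, so the single surviving candidate is the pair `("a", "b")`. With more items, the bucket filter removes pairs of frequent items whose bucket total is too low, which shrinks the candidate set that the exact counting pass has to check.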
