An Efficient Mining Method for Incremental Updation in Large Databases

Databases mined for knowledge discovery are dynamic: existing data may be updated and new transactions added over time, so the knowledge discovered from them also changes. Incremental mining techniques speed up knowledge discovery by avoiding re-learning rules from the old data. To maintain the large itemsets as new data arrives, we adopt the negative border (the itemsets that are not large but whose proper subsets are all large) to reduce the number of scans over the original database and to discover new itemsets in the updated database. Our method saves much of the effort of recomputing the negative border and efficiently obtains the minimal candidate set of large itemsets and negative border in the updated database. Simulation results show that our method runs faster than other incremental mining techniques, especially when the large itemsets in the updated database differ significantly from those in the original database.
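The role of the negative border can be illustrated with a minimal Python sketch. This is not the method proposed in the paper: it is a simplified version of the general negative-border idea, and the function names (`support_counts`, `mine_with_negative_border`, `update_with_increment`) and the `needs_rescan` flag are hypothetical names chosen for illustration. The sketch assumes transactions are sets of items and that only insertions occur.

```python
from itertools import combinations


def support_counts(transactions, candidates):
    """Count how many transactions contain each candidate itemset."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def mine_with_negative_border(transactions, min_support):
    """Level-wise (Apriori-style) mining that also records the negative border.

    Returns two dicts mapping itemsets to support counts:
    the large itemsets and the negative border.
    """
    transactions = [frozenset(t) for t in transactions]
    min_count = min_support * len(transactions)
    candidates = {frozenset([i]) for i in set().union(*transactions)}
    large, border = {}, {}
    k = 1
    while candidates:
        counts = support_counts(transactions, candidates)
        level = {c for c, n in counts.items() if n >= min_count}
        for c, n in counts.items():
            # Every candidate has all its subsets large, so a candidate
            # that fails the threshold belongs to the negative border.
            (large if c in level else border)[c] = n
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return large, border


def update_with_increment(large, border, old_size, increment, min_support):
    """Update counts by scanning only the increment (insertions only).

    needs_rescan is True when a border itemset became large: new candidates
    may then exist whose counts over the original database are unknown.
    """
    increment = [frozenset(t) for t in increment]
    min_count = min_support * (old_size + len(increment))
    tracked = {**large, **border}
    delta = support_counts(increment, tracked)
    totals = {c: tracked[c] + delta[c] for c in tracked}
    new_large = {c: n for c, n in totals.items() if n >= min_count}
    new_border = {
        c: n for c, n in totals.items()
        if n < min_count and (
            len(c) == 1
            or all(frozenset(s) in new_large
                   for s in combinations(c, len(c) - 1))
        )
    }
    needs_rescan = any(c in border for c in new_large)
    return new_large, new_border, needs_rescan
```

In this sketch, an increment in which no border itemset becomes large is handled by scanning the new transactions alone. Only when `needs_rescan` is true would the original database have to be read again to count the newly generated candidates, which is the case the negative border is meant to detect.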
