MEIT: Memory Efficient Itemset Tree for Targeted Association Rule Mining

The Itemset Tree (IT) is an efficient data structure for performing targeted queries for itemset mining and association rule mining. It can be updated incrementally by inserting new transactions, and it provides efficient querying and updating algorithms. However, an important limitation of the IT structure concerning scalability is that it consumes a large amount of memory. In this paper, we address this limitation by proposing an improved data structure named MEIT (Memory Efficient Itemset Tree). It offers an efficient node compression mechanism for reducing IT node size, and it performs on-the-fly node decompression to restore compressed information when needed. An experimental study on datasets commonly used in the data mining literature, representing various types of data, shows that MEITs are up to 60% smaller than ITs (43% on average).
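The abstract does not spell out the compression mechanism, so the following is only an illustrative sketch and not the IT or MEIT algorithm itself. It uses a plain prefix tree over sorted transactions, in which each node stores only the item that extends its parent (an assumed stand-in for "node compression") and rebuilds its full itemset by walking up to the root when needed (an assumed stand-in for "on-the-fly decompression"). The names `Node`, `PrefixTree`, `suffix`, and `full_itemset` are hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical compressed node: it keeps only the items that extend
    # its parent's itemset, not the full itemset.
    suffix: tuple[int, ...]
    parent: Node | None = None
    support: int = 0
    children: dict[int, Node] = field(default_factory=dict)

    def full_itemset(self) -> tuple[int, ...]:
        # On-the-fly reconstruction: concatenate the suffixes found on the
        # path from the root down to this node.
        parts = []
        node = self
        while node is not None:
            parts.append(node.suffix)
            node = node.parent
        return tuple(item for part in reversed(parts) for item in part)


class PrefixTree:
    """Simplified prefix tree over sorted transactions (not a real IT)."""

    def __init__(self) -> None:
        self.root = Node(suffix=())

    def insert(self, transaction) -> None:
        # Incremental update: each new transaction follows or extends one path.
        node = self.root
        node.support += 1
        for item in sorted(set(transaction)):
            child = node.children.get(item)
            if child is None:
                child = Node(suffix=(item,), parent=node)
                node.children[item] = child
            child.support += 1
            node = child

    def support(self, itemset) -> int:
        # Targeted query: count the transactions containing every item.
        return self._count(self.root, sorted(set(itemset)), 0)

    def _count(self, node: Node, target: list[int], pos: int) -> int:
        if pos == len(target):
            return node.support
        total = 0
        for item, child in node.children.items():
            if item == target[pos]:
                total += self._count(child, target, pos + 1)
            elif item < target[pos]:
                total += self._count(child, target, pos)
            # item > target[pos]: paths are sorted, so target[pos] cannot
            # appear below this child and the branch is skipped.
        return total


if __name__ == "__main__":
    tree = PrefixTree()
    for t in ([1, 2, 3], [1, 2], [2, 3], [1, 3]):
        tree.insert(t)
    print(tree.support([1, 2]))  # prints 2
    print(tree.support([3]))     # prints 3
    leaf = tree.root.children[1].children[2].children[3]
    print(leaf.full_itemset())   # prints (1, 2, 3)
```

In this sketch, storing only the suffix trades a small amount of query-time work (the walk to the root) for smaller nodes, which is the general trade-off the abstract describes; the actual savings reported for MEIT come from the paper's own mechanism, not from this code.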
