Min-Max Itemset Trees for Dense and Categorical Datasets

The itemset tree data structure is used in targeted association mining to find rules within a user's sphere of interest. In this paper, we propose two enhancements to the original unordered itemset tree. The first sorts all nodes in lexical order based on the itemsets they contain. The second, called the Min-Max Itemset Tree, augments each node with minimum and maximum values that represent the range of itemsets contained in the children below it. We provide a comprehensive evaluation of the effects of both enhancements on the itemset tree querying process through experiments on sparse, dense, and categorical datasets.
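
To make the two enhancements concrete, the sketch below shows how sorted children and per-node range bounds can each cut work during a support-count query. It is a simplified illustration, not the paper's structure: it assumes a plain prefix tree with one integer item per node rather than a full itemset tree, and every name in it (Node, insert, support, lo, hi) is hypothetical.

```python
from bisect import bisect_left


class Node:
    """One item per node (a simplification; real itemset-tree nodes hold itemsets)."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0        # transactions whose sorted prefix reaches this node
        self.children = []    # enhancement 1: kept sorted by item
        self.lo = item        # enhancement 2: smallest item in this subtree
        self.hi = item        # enhancement 2: largest item in this subtree


def insert(root, transaction):
    """Add one transaction, keeping children sorted and range bounds current."""
    items = sorted(set(transaction))
    node = root
    node.count += 1
    for item in items:
        keys = [c.item for c in node.children]
        pos = bisect_left(keys, item)
        if pos == len(keys) or keys[pos] != item:
            node.children.insert(pos, Node(item))
        child = node.children[pos]
        child.count += 1
        # The largest item of this transaction lies in child's subtree.
        child.hi = max(child.hi, items[-1])
        node = child


def _count(node, query, i):
    total = 0
    for child in node.children:
        if child.item > query[i]:
            break       # sorted order: no later sibling can contain query[i]
        if child.hi < query[-1]:
            continue    # range bound: subtree cannot contain the last query item
        if child.item == query[i]:
            if i + 1 == len(query):
                total += child.count
            else:
                total += _count(child, query, i + 1)
        else:
            total += _count(child, query, i)
    return total


def support(root, itemset):
    """Number of inserted transactions that contain every item in itemset."""
    query = tuple(sorted(set(itemset)))
    if not query:
        return root.count
    return _count(root, query, 0)


def build(transactions):
    root = Node()
    for t in transactions:
        insert(root, t)
    return root
```

In this simplified tree the lower bound lo always equals the node's own item, so only hi does any pruning; it is stored here only to mirror the min and max values described above. For example, after build([[1, 2, 3], [1, 3], [2, 3], [1, 2], [3, 4]]), a query for {1, 4} skips the subtree under item 1 because its hi is 3, then stops at item 2 because the children are sorted.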
