Demand-driven frequent itemset mining using pattern structures

Frequent itemset mining aims at discovering patterns the supports of which are beyond a given threshold. In many applications, including network event management systems, which motivated this work, patterns are composed of items each described by a subset of attributes of a relational table. As it involves an exponential mining space, the efficient implementation of user preferences and mining constraints becomes the first priority for a mining algorithm. User preferences and mining constraints are often expressed using patterns’ attribute structures. Unlike traditional methods that mine all frequent patterns indiscriminately, we regard frequent itemset mining as a two-step process: the mining of the pattern structures and the mining of patterns within each pattern structure. In this paper, we present a novel architecture that uses pattern structures to organize the mining space. In comparison with the previous techniques, the advantage of our approach is two-fold: (i) by exploiting the interrelationships among pattern structures, execution times for mining can be reduced significantly; and (ii) more importantly, it enables us to incorporate high-level simple user preferences and mining constraints into the mining process efficiently. These advantages are demonstrated by our experiments using both synthetic and real-life datasets.

[1]  Alfons Kemper,et al.  Bulletin of the Ieee Computer Society Technical Committee on Data Engineering , 1999 .

[2]  Joseph L. Hellerstein,et al.  Discovering actionable patterns in event data , 2002, IBM Syst. J..

[3]  R. Agrawal,et al.  Research Report Mining Sequential Patterns: Generalizations and Performance Improvements Limited Distribution Notice Mining Sequential Patterns: Generalizations and Performance Improvements , 1996 .

[4]  Ramakrishnan Srikant,et al.  Fast algorithms for mining association rules , 1998, VLDB 1998.

[5]  Jiawei Han,et al.  Discovery of Multiple-Level Association Rules from Large Databases , 1995, VLDB.

[6]  Philip S. Yu,et al.  Mining associations by pattern structure in large relational tables , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[7]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[8]  Surajit Chaudhuri Data Mining and Database Systems: Where is the Intersection? , 1998, IEEE Data Eng. Bull..

[9]  Joseph L. Hellerstein,et al.  FARM: a framework for exploring mining spaces with multiple attributes , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[10]  Ramakrishnan Srikant,et al.  Mining sequential patterns , 1995, Proceedings of the Eleventh International Conference on Data Engineering.

[11]  Jiawei Han,et al.  Meta-Rule-Guided Mining of Association Rules in Relational Databases , 1995, KDOOD/TDOOD.

[12]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[13]  Rakesh Agarwal,et al.  Fast Algorithms for Mining Association Rules , 1994, VLDB 1994.

[14]  Jiawei Han,et al.  Metarule-Guided Mining of Multi-Dimensional Association Rules Using Data Cubes , 1997, KDD.

[15]  Ramakrishnan Srikant,et al.  Mining Association Rules with Item Constraints , 1997, KDD.

[16]  Ramakrishnan Srikant,et al.  Mining generalized association rules , 1995, Future Gener. Comput. Syst..

[17]  Carlo Zaniolo,et al.  Metaqueries for Data Mining , 1996, Advances in Knowledge Discovery and Data Mining.

[18]  Laks V. S. Lakshmanan,et al.  Exploratory mining and pruning optimizations of constrained associations rules , 1998, SIGMOD '98.

[19]  Laks V. S. Lakshmanan,et al.  On dual mining: from patterns to circumstances, and back , 2001, Proceedings 17th International Conference on Data Engineering.