Optimization of constrained frequent set queries with 2-variable constraints

Currently, there is tremendous interest in providing ad-hoc mining capabilities in database management systems. As a first step towards this goal, in [15] we proposed an architecture for supporting constraint-based, human-centered, exploratory mining of various kinds of rules including associations, introduced the notion of constrained frequent set queries (CFQs), and developed effective pruning optimizations for CFQs with 1-variable (1-var) constraints. While 1-var constraints are useful for constraining the antecedent and consequent separately, many natural examples of CFQs illustrate the need for constraining the antecedent and consequent jointly, for which 2-variable (2-var) constraints are indispensable. Developing pruning optimizations for CFQs with 2-var constraints is the subject of this paper. But this is a difficult problem because: (i) in 2-var constraints, both variables keep changing and, unlike 1-var constraints, there is no fixed target for pruning; (ii) as we show, “conventional” monotonicity-based optimization techniques do not apply effectively to 2-var constraints. The contributions are as follows. (1) We introduce a notion of quasi-succinctness, which allows a quasi-succinct 2-var constraint to be reduced to two succinct 1-var constraints for pruning. (2) We characterize the class of 2-var constraints that are quasi-succinct. (3) We develop heuristic techniques for non-quasi-succinct constraints. Experimental results show the effectiveness of all our techniques. (4) We propose a query optimizer for CFQs and show that for a large class of constraints, the computation strategy generated by the optimizer is ccc-optimal, i.e., minimizing the effort incurred w.r.t. constraint checking and support counting.
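As a rough illustration of contribution (1), the sketch below takes the 2-var constraint max(S.price) <= min(T.price) and derives two weaker 1-var constraints from the frequent single items, so that candidate sets can be discarded before their support is counted. It is not the paper's algorithm. The item names, prices, transactions, support threshold, and all function names are hypothetical. The bounds rest on one observation: every item of a frequent set is itself frequent, so max(S.price) can be at most the highest frequent drink price, and min(T.price) must be at least the lowest frequent snack price.

```python
from itertools import combinations

# Hypothetical toy data: S ranges over sets of snacks, T over sets of drinks.
SNACKS = {"chips": 3, "nuts": 6, "jerky": 9}
DRINKS = {"water": 1, "soda": 4, "juice": 7}
PRICE = {**SNACKS, **DRINKS}
TRANSACTIONS = [
    {"chips", "nuts", "soda", "juice"},
    {"chips", "nuts", "water", "soda"},
    {"chips", "jerky", "soda", "juice"},
    {"nuts", "jerky", "water", "juice"},
    {"chips", "nuts", "jerky", "soda"},
]
MIN_SUPPORT = 2


def support(itemset, counted):
    """Support of itemset; `counted` memoizes and records every set counted."""
    if itemset not in counted:
        counted[itemset] = sum(1 for t in TRANSACTIONS if itemset <= t)
    return counted[itemset]


def frequent_sets(domain, keep, counted):
    """Frequent non-empty subsets of `domain` that pass the 1-var filter `keep`.

    Brute-force enumeration (fine for a toy); the filter runs before counting.
    """
    found = []
    for k in range(1, len(domain) + 1):
        for combo in combinations(sorted(domain), k):
            s = frozenset(combo)
            if keep(s) and support(s, counted) >= MIN_SUPPORT:
                found.append(s)
    return found


def two_var(s, t):
    """The 2-var constraint: max(S.price) <= min(T.price)."""
    return max(PRICE[i] for i in s) <= min(PRICE[i] for i in t)


def answer_naive():
    """Count support for every candidate, then check the 2-var constraint."""
    counted = {}
    ss = frequent_sets(SNACKS, lambda s: True, counted)
    ts = frequent_sets(DRINKS, lambda t: True, counted)
    return {(s, t) for s in ss for t in ts if two_var(s, t)}, len(counted)


def answer_pruned():
    """Derive two 1-var constraints from level 1 and filter before counting."""
    counted = {}
    freq_snacks = [i for i in SNACKS if support(frozenset([i]), counted) >= MIN_SUPPORT]
    freq_drinks = [i for i in DRINKS if support(frozenset([i]), counted) >= MIN_SUPPORT]
    if not freq_snacks or not freq_drinks:
        return set(), len(counted)
    hi = max(PRICE[i] for i in freq_drinks)  # bound for max(S.price)
    lo = min(PRICE[i] for i in freq_snacks)  # bound for min(T.price)
    ss = frequent_sets(SNACKS, lambda s: max(PRICE[i] for i in s) <= hi, counted)
    ts = frequent_sets(DRINKS, lambda t: min(PRICE[i] for i in t) >= lo, counted)
    return {(s, t) for s in ss for t in ts if two_var(s, t)}, len(counted)


naive_pairs, naive_counts = answer_naive()
pruned_pairs, pruned_counts = answer_pruned()
```

On this toy data both strategies return the same valid (S, T) pairs, while the pruned strategy counts support for fewer candidate sets. Both derived filters are anti-monotone (a set that violates one cannot be repaired by adding items), which is why applying them before support counting loses no answers.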

Laks V. S. Lakshmanan | Jiawei Han | Alex T. Pang | Raymond T. Ng

[1] Renée J. Miller,et al. Association rules over interval data , 1997, SIGMOD '97.

[2] Yasuhiko Morimoto,et al. Data mining using two-dimensional optimized association rules: scheme, algorithms, and visualization , 1996, SIGMOD '96.

[3] Ramakrishnan Srikant,et al. Fast algorithms for mining association rules , 1998, VLDB 1998.

[4] Hannu Toivonen,et al. Sampling Large Databases for Association Rules , 1996, VLDB.

[5] T. J. Watson,et al. An Effective Hash-Based Algorithm for Mining Association Rules , 1995 .

[6] Heikki Mannila,et al. A database perspective on knowledge discovery , 1996, CACM.

[7] Roberto J. Bayardo,et al. Efficiently mining long patterns from databases , 1998, SIGMOD '98.

[8] David B. Lomet,et al. The New Jersey Data Reduction Report, Bulletin of the Technical Committee on Data Engineering, Special Issue on Data Reduction Techniques , 2022 .

[9] Christos Faloutsos,et al. Ratio Rules: A New Paradigm for Fast, Quantifiable Data Mining , 1998, VLDB.

[10] Laks V. S. Lakshmanan,et al. Exploratory mining and pruning optimizations of constrained associations rules , 1998, SIGMOD '98.

[11] Sunita Sarawagi,et al. Integrating association rule mining with relational database systems: alternatives and implications , 1998, SIGMOD '98.

[12] Jiawei Han,et al. Metarule-Guided Mining of Multi-Dimensional Association Rules Using Data Cubes , 1997, KDD.

[13] Philip S. Yu,et al. An effective hash-based algorithm for mining association rules , 1995, SIGMOD '95.

[14] Jiawei Han,et al. Discovery of Multiple-Level Association Rules from Large Databases , 1995, VLDB.

[15] Ramakrishnan Srikant,et al. Mining Association Rules with Item Constraints , 1997, KDD.

[16] Tomasz Imielinski,et al. Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[17] Jiawei Han,et al. Maintenance of discovered association rules in large databases: an incremental updating technique , 1996, Proceedings of the Twelfth International Conference on Data Engineering.

[18] Ramakrishnan Srikant,et al. Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[19] Surajit Chaudhuri. Data Mining and Database Systems: Where is the Intersection? , 1998, IEEE Data Eng. Bull..

[20] Heikki Mannila,et al. Finding interesting rules from large sets of discovered association rules , 1994, CIKM '94.

[21] Rajeev Motwani,et al. Beyond market baskets: generalizing association rules to correlations , 1997, SIGMOD '97.

[22] Sridhar Ramaswamy,et al. On the Discovery of Interesting Patterns in Association Rules , 1998, VLDB.

[23] Ramakrishnan Srikant,et al. Mining generalized association rules , 1995, Future Gener. Comput. Syst..

[24] Abraham Silberschatz,et al. Database systems—breaking out of the box , 1997, SIGMOD Rec..

[25] Chris Clifton,et al. Query flocks: a generalization of association-rule mining , 1998, SIGMOD '98.

[26] Ramakrishnan Srikant,et al. Mining quantitative association rules in large relational tables , 1996, SIGMOD '96.