Constraint-Based Pattern Set Mining

Local pattern mining algorithms generate sets of patterns that are typically not directly useful and must be processed further before they can be applied or interpreted. Rather than investigating each pattern individually at the local level, we propose to mine for global models directly. A global model is essentially a pattern set that is interpreted as a disjunction of its patterns. This makes it possible to specify constraints at the level of the pattern sets of interest, which leads to a constraint-based mining and inductive querying approach for global pattern mining. We introduce various natural types of constraints, discuss their properties, and show how they can be used for pattern set mining. A key contribution is to show how well-known properties from local pattern mining, such as monotonicity and anti-monotonicity, can be adapted for use in pattern set mining. This in turn allows us to adapt existing algorithms for itemset mining to pattern set mining. Two algorithms are presented: a level-wise algorithm that mines all pattern sets satisfying a conjunction of a monotonic and an anti-monotonic constraint, and an algorithm that adds the capability of answering top-k queries. We also report on a case study on classification rule selection using this new technique.
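
The level-wise idea can be illustrated with a minimal sketch. It is not the algorithm from the paper, whose details are not given here, but an Apriori-style search lifted from itemsets to pattern sets under stated assumptions: patterns are itemsets, a pattern set covers a transaction if at least one of its patterns does (the disjunctive reading), and the two constraints, the toy data, and all function names are hypothetical. Under the disjunctive reading, adding a pattern can only increase coverage, so "covers at most 1 negative example" is anti-monotonic over pattern sets and "covers at least 4 positive examples" is monotonic.

```python
from itertools import combinations

def coverage(pattern_set, patterns, transactions):
    """Count transactions covered by the disjunction of the chosen patterns."""
    return sum(
        1 for t in transactions
        if any(patterns[i] <= t for i in pattern_set)
    )

def levelwise_pattern_sets(patterns, anti_monotonic, monotonic):
    """Return every pattern set (tuple of increasing indices into `patterns`)
    that satisfies both constraints, growing sets one pattern per level."""
    n = len(patterns)
    level = [(i,) for i in range(n) if anti_monotonic((i,))]
    results = []
    while level:
        # Sets on this level already satisfy the anti-monotonic constraint.
        results.extend(s for s in level if monotonic(s))
        survivors = set(level)
        next_level = []
        for s in level:
            for j in range(s[-1] + 1, n):
                cand = s + (j,)
                # Apriori-style pruning: every subset one element smaller
                # must itself have survived the anti-monotonic constraint.
                if all(sub in survivors for sub in combinations(cand, len(s))) \
                        and anti_monotonic(cand):
                    next_level.append(cand)
        level = next_level
    return results

# Hypothetical toy data: four single-item patterns, labelled transactions.
patterns = [frozenset("a"), frozenset("b"), frozenset("c"), frozenset("d")]
positives = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"c"}]
negatives = [{"a", "d"}, {"b", "d"}, {"d"}]

result = levelwise_pattern_sets(
    patterns,
    anti_monotonic=lambda s: coverage(s, patterns, negatives) <= 1,
    monotonic=lambda s: coverage(s, patterns, positives) >= 4,
)
print(result)
```

On this toy data the search returns the pattern sets {a, c} and {b, c}, reported as the index tuples (0, 2) and (1, 2). The singleton {d} fails the anti-monotonic constraint on the first level, so no set containing it is ever tested, which is the same pruning argument used for itemsets.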

[7]  Laks V. S. Lakshmanan,et al.  Efficient dynamic mining of constrained frequent sets , 2003, TODS.

[8]  Luc De Raedt,et al.  A theory of inductive query answering , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[9]  瀬々 潤,et al.  Traversing Itemset Lattices with Statistical Metric Pruning (小特集 「発見科学」及び一般演題) , 2000 .

[10]  Jian Pei,et al.  Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach , 2006, Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06).

[11]  Kouichi Hirata,et al.  Extraction of Frequent Few-Overlapped Monotone DNF Formulas with Depth-First Pruning , 2005, PAKDD.

[12]  Luc De Raedt,et al.  Molecular feature mining in HIV data , 2001, KDD '01.

[13]  Wynne Hsu,et al.  Integrating Classification and Association Rule Mining , 1998, KDD.

[14]  Christian Hidber,et al.  Association Rule Mining , 2017 .

[15]  Laks V. S. Lakshmanan,et al.  Pushing Convertible Constraints in Frequent Itemset Mining , 2004, Data Mining and Knowledge Discovery.

[16]  Heikki Mannila,et al.  Levelwise Search and Borders of Theories in Knowledge Discovery , 1997, Data Mining and Knowledge Discovery.