Towards Improving Solution Dominance with Incomparability Conditions: A case-study using Generator Itemset Mining

Finding interesting patterns is a challenging task in data mining. Constraint-based mining is a well-known approach to this task, and constraint programming has been shown to be a well-suited and generic framework for it. Dominance programming has been proposed as an extension that can capture an even wider class of constraint-based mining problems by allowing patterns to be compared with one another. In this paper, in addition to specifying a dominance relation, we introduce the ability to specify an incomparability condition. Using these two concepts, we devise a generic framework that performs a batch-wise search and avoids checking incomparable solutions against each other. We extend the ESSENCE language and its underlying modelling pipeline to support this. We use the generator itemset mining problem as a test case and give a declarative specification for it. We also present preliminary experimental results on this problem class with a CP solver backend. They show that using the incomparability condition during search can improve the efficiency of dominance programming and reduce the need for post-processing to filter out dominated solutions.
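The batch-wise idea can be illustrated with a small brute-force sketch in Python. It is not the ESSENCE model or the CP-based pipeline described above. It assumes the standard definition of a generator (an itemset with no proper subset of equal support) and assumes, for illustration only, that itemsets of the same cardinality are treated as incomparable, so each batch holds one cardinality and a candidate is checked only against solutions from earlier batches. The names `support` and `mine_generators` are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def mine_generators(transactions, min_support):
    """Brute-force, batch-wise enumeration of frequent generator itemsets.

    Assumption for this sketch: itemsets of equal size are incomparable,
    so one batch = one cardinality, processed in increasing order.
    """
    items = sorted(set().union(*transactions))
    found = {}  # generator itemset -> support, from all earlier batches

    for size in range(len(items) + 1):
        batch = {}  # solutions of this batch; never compared to each other
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            sup = support(candidate, transactions)
            if sup < min_support:
                continue  # violates the frequency constraint
            # Dominance check against earlier batches only: a strictly
            # smaller generator with the same support dominates the candidate.
            if any(g < candidate and s == sup for g, s in found.items()):
                continue
            batch[candidate] = sup
        # Publish the batch only once it is complete.
        found.update(batch)
    return found


if __name__ == "__main__":
    data = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("a")]
    for itemset, sup in mine_generators(data, 1).items():
        print(sorted(itemset), sup)
```

Because every solution in a batch has the same size, no solution can be a proper subset of another in that batch, which is why the dominance check can skip them. Processing sizes in increasing order also means no solution is dominated after it has been accepted, so no post-processing filter is needed in this sketch.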
