Constraint Programming for Mining Borders of Frequent Itemsets

Frequent itemset mining is one of the most studied tasks in knowledge discovery. It is often reduced to mining the positive border of frequent itemsets, i.e., the maximal frequent itemsets. Infrequent itemset mining, on the other hand, can be reduced to mining the negative border, i.e., the minimal infrequent itemsets. We propose a generic framework based on constraint programming to mine both borders of frequent itemsets; a single parameter selects which border is mined. To this end, we introduce two new global constraints, FREQUENTSUBS and INFREQUENTSUPERS, with complete polynomial propagators. We then consider the problem of mining borders with additional constraints. We prove that this problem is coNP-hard, which rules out the existence of a single CSP that solves it (unless coNP ⊆ NP).
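To make the two borders concrete, the following is a minimal brute-force sketch in Python. It is not the constraint programming framework proposed here and does not implement FREQUENTSUBS or INFREQUENTSUPERS; it only enumerates all itemsets of a toy database and applies the definitions directly. The function names `cover_size` and `borders`, the absolute threshold `min_support`, and the example database are assumptions made for illustration.

```python
from itertools import combinations


def cover_size(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def borders(transactions, min_support):
    """Return (positive_border, negative_border) by exhaustive enumeration.

    Exponential in the number of items: meant only to illustrate the
    definitions on tiny inputs, not as a practical mining algorithm.
    """
    items = sorted(set().union(*transactions))
    all_sets = [
        frozenset(c)
        for k in range(len(items) + 1)
        for c in combinations(items, k)
    ]
    frequent = {s for s in all_sets if cover_size(s, transactions) >= min_support}

    # Positive border: frequent itemsets with no frequent immediate superset.
    # Checking immediate supersets suffices because frequency is anti-monotone.
    positive = {
        s for s in frequent
        if all((s | {i}) not in frequent for i in items if i not in s)
    }

    # Negative border: infrequent itemsets whose immediate subsets are all frequent.
    negative = {
        s for s in all_sets
        if s not in frequent and all((s - {i}) in frequent for i in s)
    }
    return positive, negative


# Hypothetical toy database of four transactions over items a, b, c.
toy_db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
positive_border, negative_border = borders(toy_db, min_support=2)
```

On this toy database with a threshold of 2, the frequent itemsets are ∅, {a}, {b}, {c}, {a, b}, and {a, c}, so the positive border is {a, b} and {a, c}, and the negative border is the single itemset {b, c}.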
