Likely-Occurring Itemsets for Pattern Mining

We consider the itemset mining problem in general settings, such as mining association rules and itemset selection. We introduce the notion of likely-occurring itemsets and propose a greedy approach to exploring the itemset search space that reduces the number of arbitrary or closed itemsets to be considered. The method yields itemsets that are useful for different objectives, and it can be used as an additional constraint to curb the itemset explosion. In experiments, we show that the method is useful both for compression-based itemset mining and for computing good-quality association rules.
