Extraction of Frequent Few-Overlapped Monotone DNF Formulas with Depth-First Pruning

In this paper, we first introduce frequent few-overlapped monotone DNF formulas under the minimum support σ, the minimum term support τ, and the maximum overlap λ. We say that a monotone DNF formula is frequent if its support is greater than σ and the support of each of its terms (or itemsets) is greater than τ, and few-overlapped if its overlap is less than λ, where λ < τ. We then design the algorithm ffo_dnf to extract such formulas. The algorithm first enumerates all maximal frequent itemsets under τ, and then connects the extracted itemsets by disjunction ∨ until the resulting formula satisfies σ and λ. The first step of ffo_dnf, called depth-first pruning, follows from the property that every pair of itemsets in a few-overlapped monotone DNF formula is incomparable under the subset relation. Furthermore, we show that the formulas extracted by ffo_dnf are representative. Finally, we apply the algorithm ffo_dnf to bacterial culture data.
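The two steps described above can be illustrated with a minimal Python sketch. It is not the paper's ffo_dnf. It rests on several assumptions that the abstract does not state: supports are absolute transaction counts, the overlap of a formula is taken to be the largest number of transactions shared by any two of its terms, and terms are connected greedily in order of decreasing support. The function names `cover`, `maximal_frequent_itemsets`, and `connect` are hypothetical, and the depth-first search here prunes only on infrequent itemsets.

```python
def cover(itemset, transactions):
    """Indices of the transactions that contain every item of the itemset."""
    return frozenset(i for i, t in enumerate(transactions) if itemset <= t)


def maximal_frequent_itemsets(transactions, tau):
    """Depth-first search for itemsets with support > tau, keeping only maximal ones."""
    items = sorted(set().union(*transactions))
    frequent = []

    def dfs(current, start):
        for k in range(start, len(items)):
            candidate = current | {items[k]}
            # Support is anti-monotone, so an infrequent candidate is not extended.
            if len(cover(candidate, transactions)) > tau:
                frequent.append(candidate)
                dfs(candidate, k + 1)

    dfs(frozenset(), 0)
    # Maximal itemsets have no frequent proper superset.
    return [s for s in frequent if not any(s < t for t in frequent)]


def connect(terms, transactions, sigma, lam):
    """Greedily join terms by disjunction (assumed strategy, pairwise overlap < lam)."""
    covers = {t: cover(t, transactions) for t in terms}
    formula, covered = [], set()
    for t in sorted(terms, key=lambda t: (-len(covers[t]), sorted(t))):
        if all(len(covers[t] & covers[u]) < lam for u in formula):
            formula.append(t)
            covered |= covers[t]
            if len(covered) > sigma:  # the formula is now frequent under sigma
                return formula
    return None  # no formula found by this greedy pass


# Toy data, invented for illustration only.
transactions = [frozenset(t) for t in ("ab", "ab", "ab", "cd", "cd", "cd", "ac")]
terms = maximal_frequent_itemsets(transactions, tau=2)
formula = connect(terms, transactions, sigma=5, lam=1)
print([sorted(t) for t in formula])
```

On this toy data the maximal frequent itemsets under τ = 2 are {a, b} and {c, d}, and the sketch joins them into (a ∧ b) ∨ (c ∧ d), which covers six of the seven transactions with no shared transaction. Under the assumed pairwise overlap, the incomparability property is easy to see: if one term were a subset of another, the two would share every transaction of the larger term, that is, more than τ transactions, which exceeds λ.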
