Discovering (frequent) constant conditional functional dependencies

Conditional functional dependencies (CFDs) were recently introduced in the context of data cleaning. They can be seen as a unification of functional dependencies (FDs) and association rules (ARs), since they allow attributes and attribute/value pairs to be mixed in dependencies. In this paper, we present our first results on constant CFD inference. Not surprisingly, data mining techniques developed for functional dependencies and association rules can be reused for constant CFD mining. We focus on two types of techniques inherited from FD inference: the first extends the notion of agree sets, and the second extends the notions of non-redundant sets, closure and quasi-closure. We have implemented the latter technique and carried out experiments that show both the feasibility and the scalability of our proposal.
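To make the notion of a constant CFD concrete, the following minimal sketch checks whether a rule whose left- and right-hand sides consist only of attribute/value pairs holds in a small table, and counts how many tuples support it. The table contents and the helper names `matches`, `holds` and `support` are assumptions made for illustration; this is not the mining algorithm implemented in the paper.

```python
from typing import Dict, List

Row = Dict[str, str]
Pattern = Dict[str, str]  # attribute -> required constant value


def matches(row: Row, pattern: Pattern) -> bool:
    """True if the row carries every constant required by the pattern."""
    return all(row.get(attr) == value for attr, value in pattern.items())


def holds(rows: List[Row], lhs: Pattern, rhs: Pattern) -> bool:
    """A constant CFD (lhs -> rhs) holds if every row matching the
    left-hand constants also matches the right-hand constants."""
    return all(matches(row, rhs) for row in rows if matches(row, lhs))


def support(rows: List[Row], lhs: Pattern, rhs: Pattern) -> int:
    """Number of rows matching both sides (used here as the frequency)."""
    return sum(1 for row in rows if matches(row, lhs) and matches(row, rhs))


# Hypothetical toy relation, not taken from the paper.
rows = [
    {"country": "UK", "area_code": "131", "city": "Edinburgh"},
    {"country": "UK", "area_code": "131", "city": "Edinburgh"},
    {"country": "UK", "area_code": "020", "city": "London"},
    {"country": "US", "area_code": "131", "city": "Springfield"},
]

# (country = UK, area_code = 131) -> (city = Edinburgh)
rule_lhs = {"country": "UK", "area_code": "131"}
rule_rhs = {"city": "Edinburgh"}

print(holds(rows, rule_lhs, rule_rhs))
print(support(rows, rule_lhs, rule_rhs))

# Dropping the country condition breaks the rule because of the US row.
print(holds(rows, {"area_code": "131"}, rule_rhs))
```

The rule reads like an association rule over attribute/value pairs, while its semantics (every matching tuple must agree on the right-hand side) is that of a functional dependency restricted to a selection of the table, which is the sense in which CFDs combine the two.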
