Characterizing functional dependencies in formal concept analysis with pattern structures

Computing functional dependencies from a relation is an important database topic, with many applications in database management, reverse engineering and query optimization. Although the problem has been investigated in depth in those fields, it also has strong links with the mathematical framework of Formal Concept Analysis (FCA). For the discovery of functional dependencies, it is known that a relation can be expressed as the binary relation of a formal context whose implications are equivalent to those dependencies. However, this requires a new data representation whose size is quadratic in the number of objects of the original data. Here, we present an alternative that avoids such a representation and show how to characterize functional dependencies using the formalism of pattern structures, an extension of classical FCA that handles complex data. We also show how another class of dependencies, namely degenerated multivalued dependencies, can be characterized within that framework. Finally, we discuss and compare the performance of our new approach in a series of experiments on classical benchmark datasets.
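
The contrast between the two representations can be illustrated with a minimal sketch. It is not the authors' implementation: the toy relation, the attribute names and every function name below are hypothetical, and the partition test is only one plausible reading of a pattern-structure characterization (comparing the partition induced by X with the one induced by X and Y together). The first check builds the binary context with one object per pair of tuples, which is where the quadratic size comes from; the second works directly on partitions of the tuple indices.

```python
from itertools import combinations

# Hypothetical toy relation: each row is a tuple over the attributes a, b, c.
ATTRS = ("a", "b", "c")
ROWS = [
    (1, "x", 10),
    (1, "x", 20),
    (2, "y", 10),
    (3, "y", 30),
]


def agree_context(rows, attrs):
    """Binary context with one object per pair of tuples.

    Each pair is mapped to the set of attributes on which both tuples agree,
    so the context has n * (n - 1) / 2 objects for n tuples.
    """
    return {
        (i, j): frozenset(
            a for k, a in enumerate(attrs) if rows[i][k] == rows[j][k]
        )
        for i, j in combinations(range(len(rows)), 2)
    }


def fd_holds_via_context(rows, attrs, lhs, rhs):
    """X -> Y holds iff every pair agreeing on X also agrees on Y."""
    lhs, rhs = set(lhs), set(rhs)
    return all(
        rhs <= agree
        for agree in agree_context(rows, attrs).values()
        if lhs <= agree
    )


def partition(rows, attrs, subset):
    """Partition of tuple indices induced by equality on the given attributes."""
    idx = [attrs.index(a) for a in subset]
    blocks = {}
    for i, row in enumerate(rows):
        blocks.setdefault(tuple(row[k] for k in idx), set()).add(i)
    return {frozenset(block) for block in blocks.values()}


def fd_holds_via_partitions(rows, attrs, lhs, rhs):
    """X -> Y holds iff adding Y to X does not refine the partition of X."""
    both = sorted(set(lhs) | set(rhs))
    return partition(rows, attrs, lhs) == partition(rows, attrs, both)


if __name__ == "__main__":
    for lhs, rhs in [(["a"], ["b"]), (["b"], ["a"]), (["c"], ["a"])]:
        print(lhs, "->", rhs,
              fd_holds_via_context(ROWS, ATTRS, lhs, rhs),
              fd_holds_via_partitions(ROWS, ATTRS, lhs, rhs))
```

On this toy relation both checks agree on every candidate dependency, while the pair-based context already holds six objects for four tuples and the partition-based check never materializes the pairs.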
