Performance-friendly rule extraction in large water data-sets with AOC posets and relational concept analysis

In this paper, we consider data analysis methods for knowledge extraction from large water data-sets. More specifically, we try to connect physico-chemical parameters with the characteristics of the taxa living at the sample sites. Among these methods, we consider formal concept analysis (FCA), a recognized tool for classification and rule discovery on object–attribute data. Relational concept analysis (RCA) relies on FCA and deals with sets of object–attribute data equipped with relations. RCA produces more informative results, but at the expense of increased complexity. Moreover, in numerous applications of FCA, the partially ordered set of concepts introducing attributes or objects (AOC poset, for Attribute–Object–Concept poset) is used rather than the concept lattice in order to reduce combinatorial problems. AOC posets are much smaller and easier to compute than concept lattices, and still contain the information needed to rebuild the initial data. This paper introduces a variant of the RCA process based on AOC posets rather than concept lattices. This approach is compared with RCA based on iceberg lattices. Experiments are performed with various scaling operators, and a specific operator is introduced to deal with noisy data. We show that using AOC posets on water data-sets yields a reasonable number of concepts and allows us to extract meaningful implication rules (association rules whose confidence is 1), whose semantics depend on the chosen scaling operator.
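
To make the notions of AOC poset and implication rule concrete, the following minimal Python sketch computes the object-introducing and attribute-introducing concepts of a tiny formal context and measures the confidence of a rule. The context (sites, nitrate levels, taxa) and all function names are hypothetical illustrations, not data or code from the paper, and the sketch covers plain FCA only, not the relational (RCA) process or the scaling operators.

```python
# Toy formal context (hypothetical data): sample sites -> attributes they have.
CONTEXT = {
    "site1": {"low_nitrate", "taxon_A"},
    "site2": {"low_nitrate", "taxon_A", "taxon_B"},
    "site3": {"high_nitrate", "taxon_B"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def extent(attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, has in CONTEXT.items() if set(attrs) <= has)


def intent(objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(ATTRIBUTES)
    for o in objs:
        shared &= CONTEXT[o]
    return frozenset(shared)


def aoc_poset():
    """Concepts (extent, intent) that introduce at least one object or attribute."""
    concepts = set()
    for o in CONTEXT:        # object concept of o: (o'', o')
        i = intent({o})
        concepts.add((extent(i), i))
    for a in ATTRIBUTES:     # attribute concept of a: (a', a'')
        e = extent({a})
        concepts.add((e, intent(e)))
    return concepts


def confidence(premise, conclusion):
    """Confidence of the rule premise -> conclusion; None if no object matches the premise."""
    covered = extent(premise)
    if not covered:
        return None
    return len(extent(set(premise) | set(conclusion))) / len(covered)


if __name__ == "__main__":
    # List concepts from the largest extent to the smallest.
    for ext, itt in sorted(aoc_poset(), key=lambda c: -len(c[0])):
        print(sorted(ext), sorted(itt))
    # A rule with confidence 1 is an implication in the sense used above.
    print(confidence({"low_nitrate"}, {"taxon_A"}))
```

On this toy context the AOC poset has four concepts, whereas the full concept lattice would also contain a top and a bottom concept that introduce neither an object nor an attribute. The rule low_nitrate → taxon_A holds for every site matching its premise, so it is an implication; taxon_B → low_nitrate holds for only one of the two sites with taxon_B and is therefore not.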
