Machine Learning and Formal Concept Analysis

A model of learning from positive and negative examples is naturally described in terms of Formal Concept Analysis (FCA). In these terms, the result of learning consists of two sets of intents (closed subsets of attributes): the first contains intents whose extents include only positive examples, and the second contains intents whose extents include only negative examples. On the one hand, we show how the means of FCA allow one to realize learning in this model with various data representations, from the standard object-attribute representation to one based on labeled graphs. On the other hand, we use the language of FCA to give natural descriptions of some standard models of Machine Learning, such as version spaces and decision trees. This allows one to compare several machine learning approaches, as well as to employ some standard techniques of FCA in the domain of machine learning. Algorithmic issues of learning with concept lattices are discussed. We consider applications of concept-based learning, including the Structure-Activity Relationship problem (in predictive toxicology) and spam filtering.
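As an illustration of the two sets of intents described above, the following is a minimal sketch for the standard object-attribute representation. It is not the algorithm of the paper: the function names (`closed_intents`, `hypotheses`), the toy examples, and the naive intersection-based enumeration are assumptions made for this example. An intent of the positive examples is kept only if it is not contained in the description of any negative example, so that its extent over all examples contains positive examples only; the negative case is symmetric.

```python
def closed_intents(examples):
    """Return all intersections of non-empty families of example descriptions.

    `examples` maps a (hypothetical) example name to its set of attributes.
    Naive incremental enumeration, exponential in the worst case.
    """
    intents = set()
    for description in examples.values():
        description = frozenset(description)
        # Intersect the new description with every intent found so far.
        intents |= {description & other for other in intents}
        intents.add(description)
    return intents


def hypotheses(own, other):
    """Non-empty intents of `own` examples not contained in any `other` example."""
    return {
        intent
        for intent in closed_intents(own)
        if intent and not any(intent <= frozenset(d) for d in other.values())
    }


# Toy data, invented for illustration only.
positive = {"p1": {"a", "b", "c"}, "p2": {"a", "b", "d"}, "p3": {"a", "c", "d"}}
negative = {"n1": {"a", "d"}, "n2": {"b", "c", "e"}}

positive_hypotheses = hypotheses(positive, negative)
negative_hypotheses = hypotheses(negative, positive)

for h in sorted(positive_hypotheses, key=sorted):
    print("positive:", sorted(h))
for h in sorted(negative_hypotheses, key=sorted):
    print("negative:", sorted(h))
```

In this toy data the intents {a} and {a, d} of the positive examples are discarded because both are contained in the negative example n1, and the intent {a, d} of the negative examples is discarded because it is contained in the positive example p2.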
