A Local Discretization of Continuous Data for Lattices: Technical Aspects

For several years, Galois lattices (GLs) have been used in data mining, and defining a GL from complex (i.e. non-binary) data is a recent challenge (1,2). Indeed, a GL is classically defined from a binary table (called a context), and therefore, in the presence of continuous data, a discretization step is generally needed to convert continuous data into discrete data. Discretization is classically performed before the GL construction, in a global way. However, local discretization is reported to give better classification rates than global discretization when used jointly with other symbolic classification methods such as decision trees (DTs). Using a result of lattice theory that relates sets of objects to specific nodes of the lattice, we identify subsets of data on which to perform a local discretization for GLs. Experiments are performed to assess the efficiency and the effectiveness of the proposed algorithm compared to global discretization.
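The difference between global and local discretization can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: the entropy-based binary cut, the helper names `entropy` and `best_cut`, and the toy data are all assumptions made for illustration. The subset used for the local cut is chosen by hand here; in the paper it would be a set of objects associated with a specific node of the lattice.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_cut(values, labels):
    """Return the midpoint threshold minimizing the weighted class entropy.

    Returns None when all values are equal (no cut is possible).
    Hypothetical helper: one common supervised criterion, assumed here.
    """
    pairs = sorted(zip(values, labels))
    best_score, best_threshold = float("inf"), None
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold fits between two equal values
        left = [c for _, c in pairs[:i]]
        right = [c for _, c in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_score, best_threshold = score, (low + high) / 2
    return best_threshold


# Toy data (assumed): one continuous attribute and three classes.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
labels = ["a", "a", "a", "a", "b", "b", "c", "c"]

# Global discretization: a single cut computed once on all objects.
global_cut = best_cut(values, labels)

# Local discretization: a further cut computed only on a subset of objects,
# here the objects lying above the global cut (stand-in for a node's object set).
subset = [(v, c) for v, c in zip(values, labels) if v > global_cut]
local_cut = best_cut([v for v, _ in subset], [c for _, c in subset])

print(global_cut, local_cut)
```

On this toy data the global cut separates class `a` from the rest, while the local cut, computed only on the remaining objects, separates `b` from `c`, a boundary that the single global cut does not provide.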

[1] Karell Bertet, et al. Some Links Between Decision Tree and Dichotomic Lattice, 2008, CLA 2008.

[2] G. V. Kass. An Exploratory Technique for Investigating Large Quantities of Categorical Data, 1980.

[3] Engelbert Mephu Nguifo, et al. A Comparative Study of FCA-Based Supervised Classification Algorithms, 2004, ICFCA.

[4] Gerd Stumme, et al. Conceptual Structures: Broadening the Base, 2001, Lecture Notes in Computer Science.

[5] Amedeo Napoli, et al. Mining gene expression data with pattern structures in formal concept analysis, 2011, Inf. Sci.

[6] Karell Bertet, et al. Navigala: an Original Symbol Classifier Based on Navigation through a Galois Lattice, 2011, Int. J. Pattern Recognit. Artif. Intell.

[7] Fabrice Muhlenbach, et al. Discretization of Continuous Attributes, 2005.

[8] J. Ross Quinlan, et al. Improved Use of Continuous Attributes in C4.5, 1996, J. Artif. Intell. Res.

[9] Ron Kohavi, et al. Supervised and Unsupervised Discretization of Continuous Features, 1995, ICML.

[10] Leo Breiman, et al. Classification and Regression Trees, 1984.

[11] Karell Bertet, et al. Local Discretization of Numerical Data for Galois Lattices, 2011, 2011 IEEE 23rd International Conference on Tools with Artificial Intelligence.

[12] Colin Cooper, et al. Encyclopedia of Data Warehousing and Mining, 2008.

[13] Bernhard Ganter, et al. Formal Concept Analysis: Mathematical Foundations, 1998.

[14] Rudolf Wille, et al. Restructuring Lattice Theory: An Approach Based on Hierarchies of Concepts, 2009, ICFCA.

[15] Bernhard Ganter, et al. Pattern Structures and Their Projections, 2001, ICCS.