Learning Probabilistic Models by Conceptual Pyramidal Clustering

Symbolic objects (Diday (1987, 1992), Brito, Diday (1990), Brito (1991)) make it possible to model data in the form of descriptions by intension, thus generalizing the usual tabular model of data analysis. This modelling takes into account the variability within a set. The formalism of symbolic objects has some notions in common with VL1, proposed by Michalski (1980); however, VL1 is mainly based on propositional and predicate calculus, while the formalism of symbolic objects allows for an explicit interpretation within its framework, by considering the intension-extension duality. That is, given a set of observations, we consider the pair (symbolic object, extension in the given set). This results from the wish to keep a statistical point of view. The need to represent non-deterministic knowledge, that is, data in which the values of the different variables are assigned weights, led to an extension of assertion objects to probabilist objects (Diday 1992). In this case, data are represented by probability distributions on the variables' observation sets. The notions previously defined for assertion objects are then generalized to this new kind of symbolic object. Other extensions can be found in Diday (1992).
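
As an informal illustration of the intension-extension duality described above, the following Python sketch represents an assertion object as a mapping from each variable to a set of admissible values, and computes its extension in a given set of observations. It also builds one possible probabilist description of a group of observations, as the empirical distribution of each variable over that group. The variable names, the toy data and the helper functions `extension` and `empirical_distributions` are hypothetical, chosen only for this example; they are not the notation or the definitions of the cited works.

```python
from collections import Counter

def extension(assertion, observations):
    """Indices of the observations that satisfy the assertion.

    `assertion` maps each variable to its set of admissible values
    (an illustrative encoding, assumed for this sketch).
    """
    return [
        i for i, obs in enumerate(observations)
        if all(obs[var] in allowed for var, allowed in assertion.items())
    ]

def empirical_distributions(observations, members):
    """One probability distribution per variable, estimated on `members`.

    This is only one simple way to obtain a probabilist description;
    it is an assumption of the sketch, not the authors' construction.
    """
    variables = observations[members[0]].keys()
    result = {}
    for var in variables:
        counts = Counter(observations[i][var] for i in members)
        result[var] = {value: n / len(members) for value, n in counts.items()}
    return result

# Toy data set: three observations described by two categorical variables.
observations = [
    {"colour": "red", "size": "small"},
    {"colour": "blue", "size": "small"},
    {"colour": "red", "size": "large"},
]

# Intension: colour in {red, blue} and size in {small}.
assertion = {"colour": {"red", "blue"}, "size": {"small"}}

ext = extension(assertion, observations)             # extension in the data set
dist = empirical_distributions(observations, ext)    # probabilist description
```

Here the pair (`assertion`, `ext`) plays the role of the couple formed by a symbolic object and its extension, while `dist` replaces each set of admissible values by a weighted one.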