Symbolic Data Analysis: A New Tool in Data Mining

In this paper we underline some advantages of a conceptual interpretation of the data extracted from a database and of their modelling through Symbolic Objects (SO). Many Symbolic Data Analysis (SDA) techniques have recently been developed (mainly within two European research projects, SODAS and ISO3D) to visualize, summarize and analyse complex data (Bock & Diday, 2000). SO are characterized by multi-valued variables, as well as by logical relations and taxonomical structures defined on their descriptors. They can be obtained from expert descriptions of natural concepts (e.g. families, enterprises, species of animals), from typologies on classical data (e.g. groups of customers, groups of products or services, behaviours), or from queries on databases. The last two cases highlight some links between SDA and the Knowledge Discovery (KD) process based on Data Mining (DM). Both aim to identify meaningful patterns, understandable relations and rules by extracting information from huge databases. This suggests using them together to offer new and powerful tools, not only from a descriptive point of view but also from a decisional/confirmatory one. It should be noted that Symbolic Data Warehouses preserve domain knowledge in the form of logical dependency and hierarchical rules, which is usually lost in a classical Data Warehouse. At the same time, SDA methods seem particularly useful for solving some typical preprocessing problems in DM.
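
As a minimal sketch of how an SO might be obtained from a typology on classical data, the following Python fragment groups individual records and summarizes each group with multi-valued descriptors. The records, the field names (`segment`, `age`, `spend`, `channel`) and the helper `build_symbolic_objects` are hypothetical and introduced only for illustration. Describing numeric variables by [min, max] intervals and categorical variables by sets of observed categories is one simple generalization choice, not the procedure implemented in SODAS.

```python
from collections import defaultdict

# Hypothetical classical data: one record per individual customer.
records = [
    {"segment": "young", "age": 22, "spend": 120.0, "channel": "web"},
    {"segment": "young", "age": 27, "spend": 80.0, "channel": "app"},
    {"segment": "young", "age": 25, "spend": 95.5, "channel": "web"},
    {"segment": "senior", "age": 61, "spend": 300.0, "channel": "store"},
    {"segment": "senior", "age": 68, "spend": 210.0, "channel": "store"},
]


def build_symbolic_objects(rows, group_key, interval_vars, categorical_vars):
    """Aggregate individual rows into one symbolic description per group.

    Numeric variables become (min, max) intervals; categorical variables
    become the set of categories observed in the group.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    symbolic = {}
    for name, members in groups.items():
        description = {}
        for var in interval_vars:
            values = [m[var] for m in members]
            description[var] = (min(values), max(values))
        for var in categorical_vars:
            description[var] = {m[var] for m in members}
        symbolic[name] = description
    return symbolic


symbolic_objects = build_symbolic_objects(
    records, "segment", ["age", "spend"], ["channel"]
)
for name, description in symbolic_objects.items():
    print(name, description)
```

Each resulting description plays the role of a group-level unit whose variables take intervals or sets as values, rather than a single value per individual.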