This paper presents the methodology of Conceptual Characterization by Embedded Conditioning (CCEC), which automatically generates conceptual descriptions of classifications to support later decision-making, together with its application to interpreting previously identified classes that characterize the different situations arising in a wastewater treatment plant (WWTP). The distinctive feature of the method is that it interprets a partition previously obtained on an ill-structured domain, starting from a hierarchical clustering. The methodology combines statistical tools (such as the multiple boxplot, introduced by Tukey, which in this context is a powerful tool for numeric variables) with machine learning methods to learn the structure of the data. Using the concept of characterizing variable, it extracts the information needed to automatically generate a set of rules for the later identification of classes. The paper also addresses the usefulness of CCEC for building domain theories as models that support later decision-making.
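To make the idea of deriving class-identification rules from multiple boxplots more concrete, the following is a minimal sketch and not the CCEC algorithm itself. It assumes that the cut points of a numeric variable are placed at the minimum and maximum that the variable takes within each class, and that each resulting interval yields one rule per class, weighted by the proportion of observations of that class in the interval. The function names, the sample data, and the half-open treatment of interval boundaries are all illustrative assumptions.

```python
from bisect import bisect_right
from collections import Counter


def boxplot_cut_points(values, labels):
    """Sorted per-class minima and maxima of one numeric variable.

    Assumption: these extremes (the ends of each class boxplot)
    are used as the discretization cut points.
    """
    extremes = {}
    for v, c in zip(values, labels):
        lo, hi = extremes.get(c, (v, v))
        extremes[c] = (min(lo, v), max(hi, v))
    return sorted({p for pair in extremes.values() for p in pair})


def interval_rules(values, labels):
    """Return rules as (low, high, class, certainty) tuples.

    Intervals are half-open [low, high), except the last one, which is
    closed; this boundary handling is a simplification.
    """
    cuts = boxplot_cut_points(values, labels)
    if len(cuts) < 2:
        return []  # a constant variable cannot discriminate classes
    n_intervals = len(cuts) - 1
    counts = [Counter() for _ in range(n_intervals)]
    for v, c in zip(values, labels):
        i = min(bisect_right(cuts, v) - 1, n_intervals - 1)
        counts[i][c] += 1
    rules = []
    for i, counter in enumerate(counts):
        total = sum(counter.values())
        for c in sorted(counter):
            rules.append((cuts[i], cuts[i + 1], c, counter[c] / total))
    return rules


# Hypothetical measurements of one variable in two classes, A and B.
values = [1.0, 2.0, 3.0, 4.2, 4.5, 4.0, 4.3, 5.0, 6.0, 7.0]
labels = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

for low, high, cls, certainty in interval_rules(values, labels):
    print(f"if {low} <= x < {high} then class {cls} (certainty {certainty:.2f})")
```

In this sketch, an interval in which a single class reaches certainty 1 plays the role of a characterizing value: observing the variable in that interval identifies the class unambiguously, whereas intervals shared by several classes yield only partial evidence.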