Data mining is a useful decision support technique that can be used to discover production rules in data warehouses or corporate data. Much data mining research has focused on applying mining algorithms efficiently to large databases. However, a serious obstacle to their practical application is the long processing time of such algorithms. One of the key challenges today is to integrate data mining methods within the framework of traditional database systems, since such implementations can take advantage of the efficiency provided by SQL engines. In this paper, we propose an approach for integrating decision trees within a classical database system. In other words, we discover knowledge from relational databases, in the form of production rules, via a procedure that embeds SQL queries. The resulting decision tree is defined by successive, related relational views, each of which corresponds to a given population in the underlying decision tree. We selected the classical Induction Decision Tree (ID3) algorithm to build the decision tree. To check that our implementation of ID3 works properly, we successfully compared the output of our procedure with the output of an existing, validated data mining software package, SIPINA. Furthermore, since our approach is tunable, it can be generalized to any other similar decision tree-based method.
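The idea of representing each node population as a relational view can be illustrated with a minimal sketch. It is not the procedure proposed in the paper: it assumes SQLite accessed through Python's `sqlite3` module, and the table `cases`, its columns, the toy rows, and the helper names (`sql_literal`, `entropy`, `information_gain`, `split_node`) are all hypothetical. It performs a single ID3 split: class counts come from `GROUP BY` queries, the attribute with the highest information gain is selected, and one view is created per attribute value.

```python
import math
import sqlite3


def sql_literal(value):
    # SQLite does not allow bound parameters inside a view definition,
    # so the value is inlined. Assumption: attribute values are text.
    return "'" + str(value).replace("'", "''") + "'"


def entropy(counts):
    # Shannon entropy (base 2) of a list of class counts.
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def information_gain(conn, source, attr, target):
    # One GROUP BY query yields the contingency table for (attr, target).
    rows = conn.execute(
        f'SELECT "{attr}", "{target}", COUNT(*) FROM "{source}" '
        f'GROUP BY "{attr}", "{target}"'
    ).fetchall()
    by_value, by_class = {}, {}
    for value, cls, n in rows:
        by_value.setdefault(value, []).append(n)
        by_class[cls] = by_class.get(cls, 0) + n
    total = sum(by_class.values())
    remainder = sum(sum(ns) / total * entropy(ns) for ns in by_value.values())
    return entropy(by_class.values()) - remainder


def split_node(conn, source, attrs, target):
    # Choose the attribute with the highest gain, then create one view
    # per attribute value; each view is the population of a child node.
    best = max(attrs, key=lambda a: information_gain(conn, source, a, target))
    values = [r[0] for r in conn.execute(
        f'SELECT DISTINCT "{best}" FROM "{source}" ORDER BY 1')]
    children = {}
    for i, value in enumerate(values):
        view = f"{source}_{best}_{i}"
        conn.execute(
            f'CREATE VIEW "{view}" AS SELECT * FROM "{source}" '
            f'WHERE "{best}" = {sql_literal(value)}')
        children[value] = view
    return best, children


# Hypothetical toy data, used only to exercise the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (outlook TEXT, humidity TEXT, play TEXT)")
conn.executemany("INSERT INTO cases VALUES (?, ?, ?)", [
    ("sunny", "high", "no"), ("sunny", "normal", "yes"),
    ("overcast", "high", "yes"), ("overcast", "normal", "yes"),
    ("rain", "high", "no"), ("rain", "normal", "yes"),
])
best, children = split_node(conn, "cases", ["outlook", "humidity"], "play")
print(best, children)
```

Because a view can itself serve as the `source` of a further call to `split_node`, repeating the split on each child view (with the chosen attribute removed) would produce the chain of successive, related views described above. Stopping criteria and rule extraction are left out of this sketch.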