As a classical decision tree algorithm, ID3 has been widely used for its simple idea, ease of implementation, effectiveness and efficiency. Many related algorithms, such as ID4 and C4.5, have been proposed to improve ID3 in different respects. In this paper, we propose an improved classification algorithm based on ID3 that uses minimum support (minsup) and minimum confidence (minconf) thresholds to decrease the amount of data and reduce the impact of poor-quality data. The improved algorithm introduces two new concepts, `support of test attribute set to class' and `rule confidence', which are used to improve decision tree construction through both prepruning and postpruning, and ultimately to increase the efficiency and effectiveness of classification. Both theoretical analysis and tests show that the improved algorithm avoids constructing a large decision tree with many branches that carry little information, by reducing the size of the data set during tree building and by pruning useless rules from the built decision tree. It weakens the effect of poor-quality data and ultimately produces a more appropriate decision tree.
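The abstract does not give the formal definitions of the two new concepts, so the following is only a minimal Python sketch of the general idea, under stated assumptions rather than the authors' implementation. It assumes that the support of a branch is the share of the whole training set that reaches the branch and belongs to its majority class, and that rule confidence is the share of records in the branch that belong to that class. The function names (`extract_rules`, `prune_rules`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(rows, target):
    """Shannon entropy of the class column over the given rows."""
    counts = Counter(r[target] for r in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def best_attribute(rows, attrs, target):
    """ID3 split criterion: the attribute with the highest information gain."""
    base = entropy(rows, target)

    def gain(attr):
        remainder = 0.0
        for value in {r[attr] for r in rows}:
            subset = [r for r in rows if r[attr] == value]
            remainder += len(subset) / len(rows) * entropy(subset, target)
        return base - remainder

    return max(attrs, key=gain)


def extract_rules(rows, attrs, target, total, minsup, path=()):
    """Grow an ID3 tree and return its leaves as rules.

    Each rule is (conditions, class, support, confidence).
    Prepruning (assumed definition): a branch whose support falls
    below minsup is dropped and not expanded further.
    """
    label, hits = Counter(r[target] for r in rows).most_common(1)[0]
    support = hits / total          # assumption: relative to the full data set
    confidence = hits / len(rows)   # assumption: majority share in the branch
    if support < minsup:
        return []
    if confidence == 1.0 or not attrs:
        return [(path, label, support, confidence)]
    attr = best_attribute(rows, attrs, target)
    rest = [a for a in attrs if a != attr]
    rules = []
    for value in sorted({r[attr] for r in rows}):
        subset = [r for r in rows if r[attr] == value]
        rules += extract_rules(subset, rest, target, total, minsup,
                               path + ((attr, value),))
    return rules


def prune_rules(rules, minconf):
    """Postpruning (assumed definition): drop rules below minconf."""
    return [rule for rule in rules if rule[3] >= minconf]


# Hypothetical toy data: the last record is a rare, possibly noisy case.
data = [
    {"windy": "no", "humid": "high", "play": "yes"},
    {"windy": "no", "humid": "low", "play": "yes"},
    {"windy": "no", "humid": "high", "play": "yes"},
    {"windy": "no", "humid": "low", "play": "yes"},
    {"windy": "yes", "humid": "high", "play": "no"},
    {"windy": "yes", "humid": "high", "play": "no"},
    {"windy": "yes", "humid": "high", "play": "no"},
    {"windy": "yes", "humid": "low", "play": "yes"},
]

rules = extract_rules(data, ["windy", "humid"], "play", len(data), minsup=0.2)
for conditions, label, support, confidence in prune_rules(rules, minconf=0.8):
    print(conditions, "->", label, f"sup={support:.3f}", f"conf={confidence:.2f}")
```

In this sketch the single record behind the branch `windy=yes, humid=low` falls below `minsup` and is never expanded, which illustrates how a support threshold can keep rare or noisy records from adding branches. The confidence filter is applied afterwards to the extracted rules, mirroring the prepruning and postpruning split described above.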