Overpruning Large Decision Trees

This paper presents empirical evidence for five hypotheses about learning from large noisy domains: that trees built from very large training sets are both larger and more accurate than trees built from even large subsets; that this increased accuracy is only in part due to the extra size of the trees; and that the extra training instances allow both better choices of attribute while building the tree and better choices of which subtrees to prune after it has been built. For the practitioner with the common goals of maximising the accuracy and minimising the size of induced trees, these conclusions prompt new techniques for induction on large training sets. Although building huge trees from huge training sets is computationally expensive, pruning smaller trees against them is not, yet it still improves accuracy. Where a pruned tree is considered too large for human or machine limitations, it can be overpruned to an acceptable size. Although this requires far more time than building a tree of that size from a correspondingly small training set, it will usually be more accurate. The paper also describes an algorithm for overpruning trees to user-specified size limits; it is evaluated in the course of testing the above hypotheses.
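To make the notion of pruning to a user-specified size limit concrete, the following is a minimal sketch of one plausible greedy, reduced-error-style approach: it repeatedly collapses whichever subtree can be replaced by a majority-class leaf at the smallest cost in errors on a held-out pruning set, until the tree fits a leaf budget. The dict-based tree representation, the stored "majority" label at internal nodes, and the function names are illustrative assumptions, not the algorithm described in the paper.

```python
# Illustrative sketch of size-limited overpruning (assumed representation,
# not the paper's algorithm). A tree node is either a leaf {"label": c} or an
# internal node {"attr": a, "majority": c, "children": {value: subtree}};
# instances are (attribute-value dict, class label) pairs.
from copy import deepcopy

def count_leaves(node):
    if "label" in node:
        return 1
    return sum(count_leaves(child) for child in node["children"].values())

def classify(node, instance):
    if "label" in node:
        return node["label"]
    child = node["children"].get(instance.get(node["attr"]))
    # Unseen attribute value: fall back to the node's stored majority class.
    return node["majority"] if child is None else classify(child, instance)

def errors(node, prune_set):
    return sum(1 for x, y in prune_set if classify(node, x) != y)

def internal_nodes(node, path=()):
    """Yield (path, node) for every internal node, deepest first."""
    if "label" in node:
        return
    for value, child in node["children"].items():
        yield from internal_nodes(child, path + (value,))
    yield path, node

def collapse(tree, path):
    """Return a copy of the tree with the node at `path` turned into a leaf."""
    new_tree = deepcopy(tree)
    node = new_tree
    for value in path:
        node = node["children"][value]
    label = node["majority"]
    node.clear()
    node["label"] = label
    return new_tree

def overprune(tree, prune_set, max_leaves):
    """Greedily collapse subtrees until at most max_leaves leaves remain,
    each step choosing the collapse that adds the fewest errors on prune_set."""
    tree = deepcopy(tree)
    while count_leaves(tree) > max_leaves:
        candidates = [collapse(tree, path) for path, _ in internal_nodes(tree)]
        tree = min(candidates,
                   key=lambda t: (errors(t, prune_set), count_leaves(t)))
    return tree
```

The key design choice in this sketch is that the expensive step is only the evaluation of candidate collapses against the pruning data, never the regrowing of a tree, which is consistent with the abstract's point that pruning an existing tree on a large training set is cheap relative to building a huge tree from scratch.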