Bias Reduction in Split Variable Selection in C4.5
In this short communication, we discuss the bias of C4.5 in split variable selection and propose a method to reduce this bias among categorical predictor variables: a penalty proportional to the number of categories is applied to the gain splitting criterion of C4.5. Empirical comparisons show that the proposed modification of C4.5 reduces the size of the resulting classification trees.
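The idea of penalizing the gain by the number of categories can be illustrated with a minimal Python sketch. The abstract does not give the exact form of the penalty, so the subtraction `gain - penalty_weight * k`, the function names, and the constant `penalty_weight` are all assumptions made for illustration, and plain information gain is used as the base criterion.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of a multiway split on one categorical predictor."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels within each category.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def penalized_gain(values, labels, penalty_weight=0.05):
    """Gain minus a penalty proportional to the number of categories.

    ASSUMPTION: the subtractive form and penalty_weight are hypothetical;
    the source only states that the penalty is proportional to the
    number of categories.
    """
    k = len(set(values))
    return information_gain(values, labels) - penalty_weight * k


# Toy data: both predictors separate the classes perfectly,
# but id_like does so only because every case has its own category.
labels = ["a", "a", "b", "b"]
two_level = ["x", "x", "y", "y"]
id_like = ["p", "q", "r", "s"]

print(information_gain(two_level, labels), information_gain(id_like, labels))
print(penalized_gain(two_level, labels), penalized_gain(id_like, labels))
```

On this toy data the unpenalized gain cannot distinguish the two predictors, whereas the penalized version ranks the two-category predictor above the four-category one, which is the kind of bias reduction the abstract describes.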