A Post-Pruning Decision Tree Algorithm Based on Bayesian Theory

The C4.5 algorithm can produce an overgrown decision tree and overfit the training data during model construction. To overcome these disadvantages, this paper proposes a post-pruning decision tree algorithm based on Bayesian theory: each branch of the decision tree generated by C4.5 is validated with Bayes' theorem, branches that fail the test are removed, and a simpler decision tree results. The proposed algorithm is verified on data provided by the Beijing key disciplines platform and the Beijing Master and Dr. Platform. The results show that the algorithm removes the most unreliable and uneven branches, and that, compared with C4.5, it achieves higher prediction accuracy and broader coverage.
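The abstract does not give the exact Bayesian test, so the following is only a minimal sketch of the general idea: in a bottom-up pass over the tree, each branch's posterior accuracy is compared against that of collapsing the node to a leaf, and branches that do not improve on the leaf are removed. The `Node` structure, the Beta(1, 1) prior, and the posterior-mean criterion are all illustrative assumptions, not the paper's actual formulation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # counts of training samples reaching this node that the node's
    # majority-class label gets right (correct) out of all samples (total)
    correct: int
    total: int
    children: list = field(default_factory=list)

def posterior_accuracy(correct: int, total: int, alpha: float = 1.0, beta: float = 1.0) -> float:
    # Beta-Bernoulli posterior mean of the node's accuracy under a uniform
    # Beta(1, 1) prior (Laplace smoothing) -- an assumed stand-in for the
    # paper's Bayesian validation criterion.
    return (correct + alpha) / (total + alpha + beta)

def prune(node: Node) -> Node:
    # Post-order traversal: prune children first, then compare the branch's
    # posterior accuracy with that of collapsing this node to a single leaf.
    if not node.children:
        return node
    node.children = [prune(c) for c in node.children]
    branch_correct = sum(c.correct for c in node.children)
    branch_total = sum(c.total for c in node.children)
    if posterior_accuracy(node.correct, node.total) >= \
            posterior_accuracy(branch_correct, branch_total):
        node.children = []  # branch fails the test: collapse to a leaf
    return node
```

For example, a branch whose children are jointly right on 16 of 20 samples is pruned when the parent leaf alone is right on 18 of 20, while a branch that improves on its parent is kept; the smoothing prior keeps small-sample branches from looking deceptively accurate.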
