Paper Information - An Empirical Comparison of Two Boosting Algorithms on Real Data Sets Based on Analysis of Scientific Materials


Boosting algorithms build a strong ensemble classifier by aggregating a sequence of weak hypotheses. In this paper, multiple TAN classifiers generated by GTAN are combined using a method called Boosting-MultiTAN. This combined TAN classifier is compared with the Boosting-BAN classifier, which applies boosting to BAN classifiers. We conduct an empirical study comparing the two algorithms on ten real data sets, measuring performance by overall test accuracy. The experimental results show that Boosting-BAN achieves higher classification accuracy on most of the data sets, while Boosting-MultiTAN performs well on the others. These results suggest that boosting algorithms deserve more attention in the machine learning and data mining communities.
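To make the idea of aggregating weak hypotheses concrete, the following is a minimal sketch of generic AdaBoost with one-feature threshold stumps as the weak learner. It is not the Boosting-MultiTAN or Boosting-BAN procedure studied in the paper, whose details are not given here. The function names (`train_stump`, `adaboost`, `predict`, `accuracy`) and the toy data are assumptions made purely for illustration.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best threshold stump."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] <= t else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def stump_predict(stump, row):
    _, j, t, polarity = stump
    return polarity if row[j] <= t else -polarity

def adaboost(X, y, rounds=10):
    """Train a weighted ensemble of stumps; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start from uniform example weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the logarithm finite
        if err >= 0.5:  # weak learner is no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this hypothesis
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified examples, lower the rest, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

def accuracy(ensemble, X, y):
    """Fraction of examples classified correctly (the 'correct rate')."""
    return sum(predict(ensemble, r) == yi for r, yi in zip(X, y)) / len(y)

# Hypothetical toy data: no single threshold separates the two classes.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]
print(accuracy(adaboost(X, y, rounds=1), X, y))
print(accuracy(adaboost(X, y, rounds=3), X, y))
```

On this toy set a single stump misclassifies two of the six points, while three boosting rounds classify all six correctly. In an empirical comparison like the one described above, the same accuracy measure would be computed on held-out test data rather than on the training set.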

Hongbo Zhou | Xiaowei Sun
