Introduce randomness into AdaBoost for robust performance on noisy data

AdaBoost is a well-known ensemble learning algorithm that generates weak classifiers sequentially and combines them into a strong one. Although it is resistant to overfitting on low-noise data, many experiments [2,3,4] have shown that it is quite sensitive to noisy data. Several modifications to AdaBoost have been proposed to deal with noise. Bagging and Random Forests have shown resistance to noisy data, which can be explained in part by their randomness. We study introducing randomness into AdaBoost and propose a method called TRandom-AdaBoost, which outperforms AdaBoost on both low- and high-noise datasets. We conjecture that the success of the proposed method can be explained by regarding it as a Random Forest built from an ergodic sequence of weighted base hypotheses, where the weights are generated by the technique used in AdaBoost.
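To make the idea concrete, the following is a minimal sketch of one way to inject Random-Forest-style randomness into AdaBoost's base learners: each round fits a decision tree that uses random splits over random feature subsets on the AdaBoost-reweighted data, and the usual AdaBoost coefficients combine the resulting hypotheses. This is not the exact construction of TRandom-AdaBoost; the function names, tree parameters, and use of scikit-learn trees are illustrative assumptions.

    # Sketch: AdaBoost weighting over randomized base hypotheses (assumed construction).
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def random_adaboost_fit(X, y, n_rounds=50, seed=None):
        """Binary AdaBoost (labels in {-1, +1}) with randomized weak trees."""
        rng = np.random.default_rng(seed)
        n = len(y)
        w = np.full(n, 1.0 / n)            # AdaBoost example weights
        learners, alphas = [], []
        for _ in range(n_rounds):
            # Randomized weak learner: random splits over random feature subsets,
            # mirroring the randomness used by Random Forests.
            tree = DecisionTreeClassifier(
                max_depth=3,
                splitter="random",
                max_features="sqrt",
                random_state=int(rng.integers(2**31 - 1)),
            )
            tree.fit(X, y, sample_weight=w)
            pred = tree.predict(X)
            err = np.clip(np.sum(w * (pred != y)), 1e-10, 1 - 1e-10)
            alpha = 0.5 * np.log((1.0 - err) / err)   # standard AdaBoost coefficient
            w *= np.exp(-alpha * y * pred)            # reweight training examples
            w /= w.sum()
            learners.append(tree)
            alphas.append(alpha)
        return learners, np.array(alphas)

    def random_adaboost_predict(X, learners, alphas):
        # Weighted majority vote over the randomized base hypotheses.
        votes = sum(a * clf.predict(X) for clf, a in zip(learners, alphas))
        return np.sign(votes)

In an experiment one would compare this against plain AdaBoost under injected label noise; the conjecture above is that the randomized base hypotheses make the weighted ensemble behave like a Random Forest whose hypothesis weights come from AdaBoost.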

[1] D. Opitz et al. Popular Ensemble Methods: An Empirical Study, 1999, J. Artif. Intell. Res.

[2] Rocco A. Servedio et al. Boosting in the presence of noise, 2003, STOC '03.

[3] Jian Li et al. Reducing the Overfitting of Adaboost by Controlling its Data Distribution Skewness, 2006, Int. J. Pattern Recognit. Artif. Intell.

[4] Virginia Wheway et al. Using Boosting to Detect Noisy Data, 2000, PRICAI Workshops.

[5] Nikunj C. Oza et al. AveBoost2: Boosting for Noisy Data, 2004, Multiple Classifier Systems.

[6] Anneleen Van Assche et al. Ensemble Methods for Noise Elimination in Classification Problems, 2003, Multiple Classifier Systems.

[7] Raymond J. Mooney et al. Experiments on Ensembles with Missing and Noisy Data, 2004, Multiple Classifier Systems.

[8] Ana I. González Acuña. An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, Boosting, and Randomization, 2012.

[9] Rocco A. Servedio et al. Smooth boosting and learning with malicious noise, 2003.

[10] Leo Breiman. Random Forests, 2001, Machine Learning.

[11] Leo Breiman. Bagging Predictors, 1996, Machine Learning.

[12] Yoav Freund et al. Experiments with a New Boosting Algorithm, 1996, ICML.

[13] Wenxin Jiang. Some Theoretical Aspects of Boosting in the Presence of Noisy Data, 2001, ICML.

[14] Thomas G. Dietterich. An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization, 2000, Machine Learning.