A Study of Random Linear Oracle Ensembles

Random Linear Oracle (RLO) ensembles of Naive Bayes classifiers show excellent performance [12]. In this paper, we investigate the reasons for the success of RLO ensembles. Our study suggests that RLO succeeds because it decomposes most of the dataset's classes into two subclasses each. This finding leads us to a new output-manipulation-based ensemble method: Random Subclasses (RS). In the proposed method, we use the RLO framework to create new subclasses from each subset of data points belonging to the same class, and we treat each subclass as a class of its own. A comparative study suggests that RS performs similarly to RLO, and that RS is statistically better than or similar to Bagging and AdaBoost.M1 on most of the datasets. The similar performance of RLO and RS suggests that the creation of local structures (subclasses) is the main reason for the success of RLO. Another conclusion of this study is that RLO is most useful for classifiers, such as linear classifiers, that have limited flexibility in their class boundaries. Such classifiers cannot learn complex class boundaries; creating subclasses produces new class boundaries that are easier to learn.
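To make the mechanism concrete, here is a minimal sketch of one RS ensemble member, assuming NumPy and scikit-learn's GaussianNB as the base Naive Bayes learner. The hyperplane construction (a plane through the midpoint of two randomly chosen points of the class, in the spirit of the random linear oracle) and all function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def random_hyperplane_split(X_c, rng):
    # Split one class's points into two subclasses with a random hyperplane:
    # the plane passes through the midpoint of two randomly chosen points of
    # that class (illustrative choice, mirroring the random linear oracle).
    if len(X_c) < 2:
        return np.zeros(len(X_c), dtype=int)   # too few points to split
    i, j = rng.choice(len(X_c), size=2, replace=False)
    w = X_c[i] - X_c[j]                        # hyperplane normal
    b = w @ (X_c[i] + X_c[j]) / 2.0            # offset: plane through the midpoint
    return (X_c @ w >= b).astype(int)          # side of the plane -> subclass 0/1

def fit_rs_member(X, y, rng):
    # One RS ensemble member: relabel every class into two subclasses, train
    # Naive Bayes on the subclass labels, keep the subclass -> class mapping.
    sub_y = np.empty(len(y), dtype=int)
    sub_to_class = {}
    next_label = 0
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        sides = random_hyperplane_split(X[idx], rng)
        sub_y[idx] = next_label + sides
        sub_to_class[next_label] = c
        sub_to_class[next_label + 1] = c
        next_label += 2
    return GaussianNB().fit(X, sub_y), sub_to_class

def predict_rs_member(model, sub_to_class, X):
    # Predict subclasses, then map each prediction back to its original class.
    return np.array([sub_to_class[s] for s in model.predict(X)])
```

An ensemble would train many such members with independent random splits and combine their class-level predictions by majority vote, analogous to how RLO combines its oracle-gated classifier pairs.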

[1] Yves Lecourtier et al., A structural/statistical feature based vector for handwritten character recognition, 1998, Pattern Recognition Letters.

[2] Pierre Geurts et al., Extremely randomized trees, 2006, Machine Learning.

[3] Luc Devroye et al., Consistency of Random Forests and Other Averaging Classifiers, 2008, Journal of Machine Learning Research.

[4] Christoph F. Eick et al., Using Supervised Clustering to Enhance Classifiers, 2005, ISMIS.

[5] Hendrik Blockeel et al., Machine Learning: ECML 2003, 2003, Lecture Notes in Computer Science.

[6] Kagan Tumer et al., Error Correlation and Error Reduction in Ensemble Classifiers, 1996, Connection Science.

[7] D. Hand et al., Idiot's Bayes—Not So Stupid After All?, 2001.

[8] Ludmila I. Kuncheva, Combining Pattern Classifiers: Methods and Algorithms, 2004.

[9] Clément Chatelain et al., A two-stage outlier rejection strategy for numerical field extraction in handwritten documents, 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[10] Malayappan Shridhar et al., A Lexicon Directed Algorithm for Recognition of Unconstrained Handwritten Words (Special Issue on Document Analysis and Recognition), 1994.

[11] Leo Breiman, Bagging Predictors, 1996, Machine Learning.

[12] Shusaku Tsumoto et al., Foundations of Intelligent Systems, 15th International Symposium, ISMIS 2005, Saratoga Springs, NY, USA, May 25-28, 2005, Proceedings, 2005, ISMIS.

[13] Ethem Alpaydın, Combined 5 x 2 cv F Test for Comparing Supervised Classification Learning Algorithms, 1999, Neural Computation.

[14] Ricardo Vilalta et al., A Decomposition of Classes via Clustering to Explain and Improve Naive Bayes, 2003, ECML.

[15] Juan José Rodríguez Diez et al., Classifier Ensembles with a Random Linear Oracle, 2007, IEEE Transactions on Knowledge and Data Engineering.

[16] Pedro M. Domingos et al., On the Optimality of the Simple Bayesian Classifier under Zero-One Loss, 1997, Machine Learning.

[17] Subhash C. Bagui, Combining Pattern Classifiers: Methods and Algorithms, 2005, Technometrics.

[18] Yoav Freund, Boosting a weak learning algorithm by majority, 1995, COLT '90.

[19] Juan José Rodríguez Diez et al., Naïve Bayes Ensembles with a Random Oracle, 2007, MCS.

[20] Laurent Heutte et al., Using Random Forests for Handwritten Digit Recognition, 2007.

[21] Yoshua Bengio et al., Gradient-based learning applied to document recognition, 1998, Proceedings of the IEEE.

[22] Olivier Debeir et al., Limiting the Number of Trees in Random Forests, 2001, Multiple Classifier Systems.

[23] Isabelle Guyon et al., An Introduction to Variable and Feature Selection, 2003, Journal of Machine Learning Research.

[24] Christoph F. Eick et al., Class decomposition via clustering: a new framework for low-variance classifiers, 2003, Third IEEE International Conference on Data Mining.

[25] Thomas G. Dietterich, Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms, 1998, Neural Computation.

[26] Juan José Rodríguez Diez et al., Rotation Forest: A New Classifier Ensemble Method, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27] Tin Kam Ho, The Random Subspace Method for Constructing Decision Forests, 1998, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28] Ian H. Witten et al., Data mining: practical machine learning tools and techniques, 3rd Edition, 1999.

[29] Leo Breiman, Random Forests, 2001, Machine Learning.

[30] Yoav Freund et al., A decision-theoretic generalization of on-line learning and an application to boosting, 1997, EuroCOLT.