An Online Ensemble of Classifiers

With the explosive growth of data and information, the ability to learn incrementally has become increasingly important for machine learning approaches. In contrast to classic batch learning algorithms, which synthesize all available information at once, online algorithms process examples as they arrive and try to forget irrelevant information. Combining classifiers has also been proposed as a way to improve classification accuracy; however, most ensemble algorithms operate in batch mode. For this reason, we propose an online ensemble of classifiers that combines an incremental version of Naive Bayes, the Voted Perceptron, and the Winnow algorithm using a voting methodology. We performed a large-scale comparison of the proposed ensemble with other state-of-the-art algorithms on several datasets and obtained better accuracy in most cases.
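As a rough illustration of the voting methodology described above, the following is a minimal sketch of an online majority-vote ensemble. It assumes each base learner (e.g. an incremental Naive Bayes, a Voted Perceptron, and a Winnow instance) exposes a predict and an update method; the class and method names here are illustrative, not the authors' implementation.

```python
from collections import Counter


class OnlineVotingEnsemble:
    """Sketch of an online ensemble that combines base learners by majority vote."""

    def __init__(self, learners):
        # learners: assumed incremental classifiers, each with predict(x) and update(x, y).
        self.learners = learners

    def predict(self, x):
        # Each base learner casts one vote; the most common label wins.
        votes = Counter(learner.predict(x) for learner in self.learners)
        return votes.most_common(1)[0][0]

    def update(self, x, y):
        # Incremental step: every learner sees the new example exactly once.
        for learner in self.learners:
            learner.update(x, y)


# Typical online protocol (hypothetical usage): predict first, then learn
# from the revealed label.
# ensemble = OnlineVotingEnsemble([naive_bayes, voted_perceptron, winnow])
# for x, y in stream:
#     y_hat = ensemble.predict(x)
#     ensemble.update(x, y)
```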
