Learn$^{++}$.NC: Combining Ensemble of Classifiers With Dynamically Weighted Consult-and-Vote for Efficient Incremental Learning of New Classes

We have previously introduced an incremental learning algorithm, Learn++, which learns novel information from consecutive data sets by generating an ensemble of classifiers with each data set and combining them by weighted majority voting. However, Learn++ suffers from an inherent "outvoting" problem when asked to learn a new class $\omega_{new}$ introduced by a subsequent data set, because earlier classifiers not trained on this class are guaranteed to misclassify $\omega_{new}$ instances. The collective votes of the earlier classifiers, cast for an inevitably incorrect decision, then outweigh the votes of the new classifiers' correct decision on $\omega_{new}$ instances, until there are enough new classifiers to counteract the unfair outvoting. This forces Learn++ to generate an unnecessarily large number of classifiers. This paper describes Learn++.NC, which is specifically designed for efficient incremental learning of multiple new classes using significantly fewer classifiers. To do so, Learn++.NC introduces dynamically weighted consult and vote (DW-CAV), a novel voting mechanism for combining classifiers: individual classifiers consult with each other to determine which ones are most qualified to classify a given instance, and decide how much weight, if any, each classifier's decision should carry. Experiments on real-world problems indicate that the new algorithm performs remarkably well with substantially fewer classifiers, not only compared to its predecessor Learn++, but also compared to several other algorithms recently proposed for similar problems.
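The outvoting problem and the idea behind consulting can be illustrated with a small Python sketch. This is a simplified illustration under stated assumptions, not the DW-CAV procedure as specified in the paper: the function names, the per-class confidence rule, and the toy weights below are all hypothetical. It assumes each classifier has a fixed voting weight and a known set of classes it was trained on, and that a classifier's weight is scaled down when classifiers that do know a class it has never seen agree that the instance belongs to that class.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Plain weighted majority voting: the class with the largest total weight wins."""
    totals = defaultdict(float)
    for label, w in zip(predictions, weights):
        totals[label] += w
    return max(totals, key=totals.get)


def consult_and_vote(predictions, weights, trained_classes):
    """Simplified consult-then-vote (an assumed rule, for illustration only).

    For each class c, a confidence is computed among the classifiers that were
    trained on c: the weighted fraction of them that predict c. A classifier
    not trained on c then has its weight multiplied by (1 - confidence[c]).
    """
    all_classes = set().union(*trained_classes)
    confidence = {}
    for c in all_classes:
        # Total weight of classifiers that have seen class c.
        z = sum(w for w, seen in zip(weights, trained_classes) if c in seen)
        # Weight of those classifiers that actually vote for class c.
        support = sum(
            w
            for p, w, seen in zip(predictions, weights, trained_classes)
            if c in seen and p == c
        )
        confidence[c] = support / z if z > 0 else 0.0

    adjusted = []
    for w, seen in zip(weights, trained_classes):
        for c in all_classes - seen:
            w *= 1.0 - confidence[c]
        adjusted.append(w)
    return weighted_majority_vote(predictions, adjusted)


# Toy scenario: the instance truly belongs to the new class 2.
# Five earlier classifiers know only classes {0, 1} and all predict 0;
# two new classifiers know {0, 1, 2} and both predict 2.
predictions = [0, 0, 0, 0, 0, 2, 2]
weights = [1.0] * 7
trained_classes = [{0, 1}] * 5 + [{0, 1, 2}] * 2

plain_decision = weighted_majority_vote(predictions, weights)
consulted_decision = consult_and_vote(predictions, weights, trained_classes)
```

In this toy scenario plain weighted voting chooses class 0, because five earlier votes outweigh two new ones, whereas the consulted vote chooses class 2, because the earlier classifiers' weights are reduced for an instance that the better-informed classifiers unanimously assign to a class the earlier ones never saw.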
