Benchmarking of update learning strategies on digit classifier systems

Three strategies for re-training classifiers when new labeled data become available are presented in a multi-expert scenario. The first strategy uses the entire new dataset. In the second, each individual classifier selects as new training samples those samples it misclassifies. In the third, which inspects the behavior of the multi-expert system as a whole, a sample misclassified by an expert is used to update that classifier only if it is also misclassified by the ensemble. This paper compares the three approaches under different conditions on two state-of-the-art classifiers (SVM and Naive Bayes), taking into account four different combination techniques. Experiments were performed on the CEDAR handwritten digit database. The results show how performance depends on the amount of new training samples, as well as on the specific combination decision scheme and on the classifiers in the ensemble.
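The three update-set selection rules can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the function names, and the representation of experts and the ensemble as plain prediction functions, are assumptions made for clarity.

```python
# Hypothetical sketch of the three update-set selection rules.
# `expert` and `ensemble` are assumed to be callables mapping a
# sample to a predicted label; names are illustrative only.

def select_all(new_samples, new_labels, expert, ensemble):
    # Strategy 1: retrain on the entire new labeled dataset.
    return list(zip(new_samples, new_labels))

def select_expert_errors(new_samples, new_labels, expert, ensemble):
    # Strategy 2: each expert keeps only the samples it misclassifies.
    return [(x, y) for x, y in zip(new_samples, new_labels)
            if expert(x) != y]

def select_ensemble_errors(new_samples, new_labels, expert, ensemble):
    # Strategy 3: an expert's error is used for its update only when
    # the combined multi-expert decision is also wrong on that sample.
    return [(x, y) for x, y in zip(new_samples, new_labels)
            if expert(x) != y and ensemble(x) != y]
```

With strategy 3 the update set is a subset of strategy 2's, which in turn is a subset of strategy 1's, so the three rules trade off the amount of retraining data against its focus on hard samples.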
