A Cascade Multiple Classifier System for Document Categorization

This paper presents a novel cascade multiple classifier system (MCS) for document image classification. It consists of two classifiers that use different feature sets. The preceding classifier uses image features, learns the physical representation of the document, and outputs a set of candidate class labels for the second classifier. The succeeding classifier is a hierarchical classification model based on textual features. The candidate label set from the first classifier selects the subtrees of the hierarchical tree that the second classifier searches to derive a final classification decision. Restricting the search in this way reduces the computational complexity and improves the classification accuracy of the second classifier. We test the proposed cascade MCS on a large-scale tax document classification task. The experimental results show improved classification performance over the individual classifiers.
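The cascade described above can be illustrated with a minimal sketch. It is not the authors' implementation: the label tree, the score dictionaries, the `top_k` parameter, and the function names `leaves_under` and `cascade_classify` are all hypothetical, and both classifiers are stood in for by precomputed scores. The sketch only shows how a candidate set from an image-based stage can restrict the subtrees that a text-based stage searches.

```python
from typing import Dict, List

# Hypothetical label hierarchy (parent -> children); leaves are final classes.
TREE: Dict[str, List[str]] = {
    "root": ["group_a", "group_b"],
    "group_a": ["class_a1", "class_a2"],
    "group_b": ["class_b1", "class_b2"],
}


def leaves_under(node: str, tree: Dict[str, List[str]]) -> List[str]:
    """Collect every leaf label in the subtree rooted at `node`."""
    children = tree.get(node, [])
    if not children:
        return [node]
    leaves: List[str] = []
    for child in children:
        leaves.extend(leaves_under(child, tree))
    return leaves


def cascade_classify(
    image_scores: Dict[str, float],
    text_scores: Dict[str, float],
    tree: Dict[str, List[str]],
    top_k: int = 1,
) -> str:
    """Two-stage decision over assumed, precomputed classifier scores."""
    # Stage 1 (image features): keep the top_k candidate subtree roots.
    candidates = sorted(image_scores, key=image_scores.get, reverse=True)[:top_k]
    # Stage 2 (textual features): search only leaves under those subtrees.
    search_space = [leaf for c in candidates for leaf in leaves_under(c, tree)]
    return max(search_space, key=lambda leaf: text_scores.get(leaf, 0.0))


# Assumed scores for one document, for illustration only.
image_scores = {"group_a": 0.8, "group_b": 0.2}
text_scores = {"class_a1": 0.3, "class_a2": 0.4, "class_b1": 0.9, "class_b2": 0.1}

pruned = cascade_classify(image_scores, text_scores, TREE, top_k=1)
unpruned = cascade_classify(image_scores, text_scores, TREE, top_k=2)
```

With `top_k=1` the second stage scores only two of the four leaves, which is where the reduction in computation comes from. The sketch also shows that the final decision depends on the candidate set: a label outside the candidate subtrees is never considered, even if its text score is the highest.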
