Generalized multi-scale stacked sequential learning for multi-class classification

In many classification problems, neighbor data labels have inherent sequential relationships. Sequential learning algorithms take benefit of these relationships in order to improve generalization. In this paper, we revise the multi-scale sequential learning approach (MSSL) for applying it in the multi-class case (MMSSL). We introduce the error-correcting output codesframework in the MSSL classifiers and propose a formulation for calculating confidence maps from the margins of the base classifiers. In addition, we propose a MMSSL compression approach which reduces the number of features in the extended data set without a loss in performance. The proposed methods are tested on several databases, showing significant performance improvement compared to classical approaches.

[1]  Edward H. Adelson,et al.  The Laplacian Pyramid as a Compact Image Code , 1983, IEEE Trans. Commun..

[2]  David H. Wolpert,et al.  Stacked generalization , 1992, Neural Networks.

[3]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[4]  Thomas G. Dietterich,et al.  Solving Multiclass Learning Problems via Error-Correcting Output Codes , 1994, J. Artif. Intell. Res..

[5]  J. Friedman Special Invited Paper-Additive logistic regression: A statistical view of boosting , 2000 .

[6]  Andrew McCallum,et al.  Maximum Entropy Markov Models for Information Extraction and Segmentation , 2000, ICML.

[7]  Yoram Singer,et al.  Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers , 2000, J. Mach. Learn. Res..

[8]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[9]  Thomas G. Dietterich Machine Learning for Sequential Data: A Review , 2002, SSPR/SPR.

[10]  Vladimir Kolmogorov,et al.  Computing geodesics and minimal surfaces via graph cuts , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[11]  Vadim Mottl,et al.  Pattern recognition in interrelated data: the problem, fundamental assumptions, recognition algorithms , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..

[12]  Thomas G. Dietterich,et al.  Training conditional random fields via gradient tree boosting , 2004, ICML.

[13]  William W. Cohen,et al.  Stacked Sequential Learning , 2005, IJCAI.

[14]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[15]  Yann LeCun,et al.  Graph transformer networks for image recognition , 2005 .

[16]  Problem-dependent designs for Error Correcting Output Codes , 2006 .

[17]  Janez Demsar,et al.  Statistical Comparisons of Classifiers over Multiple Data Sets , 2006, J. Mach. Learn. Res..

[18]  Gareth Funka-Lea,et al.  Graph Cuts and Efficient N-D Image Segmentation , 2006, International Journal of Computer Vision.

[19]  Sergio Escalera,et al.  Subclass Problem-Dependent Design for Error-Correcting Output Codes , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Wolfgang Förstner,et al.  eTRIMS Image Database for Interpreting Images of Man-Made Scenes , 2009 .

[21]  Carlo Gatta,et al.  Multi-scale stacked sequential learning , 2009, Pattern Recognit..

[22]  Sergio Escalera,et al.  On the Decoding Process in Ternary Error-Correcting Output Codes , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Petia Radeva,et al.  A Holistic Approach for the Detection of Media-Adventitia Border in IVUS , 2011, MICCAI.

[24]  Petia Radeva,et al.  Personalization and user verification in wearable systems using biometric walking patterns , 2011, Personal and Ubiquitous Computing.

[25]  Dong Wang,et al.  Learning machines: Rationale and application in ground-level ozone prediction , 2014, Appl. Soft Comput..