Complexity of Classification Problems and Comparative Advantages of Combined Classifiers

We studied several measures of the complexity of classification problems and related them to the comparative advantages of two methods for creating multiple classifier systems. Using decision trees as prototypical classifiers, and bootstrapping and subspace projection as classifier generation methods, we examined a collection of 437 two-class problems from public databases. We observed strong correlations between classifier accuracies, a measure of class boundary length, and a measure of class manifold thickness. Moreover, the bootstrapping method appears to be better when subsamples yield more variable boundary measures, while the subspace method excels when many features contribute evenly to the discrimination.
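The two classifier generation methods contrasted above can be sketched as follows. This is an illustrative sketch, not the paper's code: it uses scikit-learn's `BaggingClassifier` to build decision-tree ensembles by bootstrapping the training samples (bagging) versus projecting onto random feature subspaces; the dataset and all parameter values are arbitrary placeholders.

```python
# Hypothetical illustration of the two ensemble-generation methods:
# bootstrapping (bagging) vs. the random subspace method.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class problem standing in for one of the public databases.
X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_classes=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Bootstrapping: each tree is trained on a bootstrap sample of the
# training points, using all features.
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                            bootstrap=True, random_state=0).fit(X_tr, y_tr)

# Random subspace method: each tree is trained on all points, but only on
# a random half of the features (no resampling of points).
subspace = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                             bootstrap=False, max_features=0.5,
                             random_state=0).fit(X_tr, y_tr)

print(bagging.score(X_te, y_te), subspace.score(X_te, y_te))
```

Comparing the two held-out accuracies across many problems, stratified by the complexity measures described above, is the kind of experiment the study reports.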
