A Data Complexity Analysis of Comparative Advantages of Decision Forest Constructors

Abstract: Using a number of measures for characterising the complexity of classification problems, we studied the comparative advantages of two methods for constructing decision forests: bootstrapping and random subspaces. We investigated a collection of 392 two-class problems from the UCI repository and observed strong correlations between classifier accuracy and measures of the length of class boundaries, the thickness of the class manifolds, and the nonlinearity of decision boundaries. We found characteristics of both difficult and easy cases in which combination methods are no better than single classifiers. We also observed that the bootstrapping method is better when the training samples are sparse, and that the subspace method is better when the classes are compact and the boundaries are smooth.
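
The difference between the two forest constructors compared above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the default subspace size of half the features, and the list-of-lists data layout are all assumptions made for illustration. Bootstrapping resamples the training rows with replacement and keeps every feature; the random subspace method keeps every row and restricts each tree to a random subset of features.

```python
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (bootstrap resampling)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def subspace_indices(n_features, k, rng):
    """Pick k distinct feature indices (random subspace selection)."""
    return sorted(rng.sample(range(n_features), k))

def make_training_views(X, y, n_trees, method, k=None, seed=0):
    """Build one training view per tree; each view would feed one tree learner.

    Hypothetical helper: `method` is "bootstrap" or "subspace". The default
    subspace size (half the features) is an assumption, not taken from the
    abstract.
    """
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    if k is None:
        k = max(1, n_features // 2)
    views = []
    for _ in range(n_trees):
        if method == "bootstrap":
            rows = bootstrap_indices(n_samples, rng)  # resample rows
            cols = list(range(n_features))            # keep all features
        elif method == "subspace":
            rows = list(range(n_samples))             # keep all rows
            cols = subspace_indices(n_features, k, rng)
        else:
            raise ValueError("method must be 'bootstrap' or 'subspace'")
        X_view = [[X[r][c] for c in cols] for r in rows]
        y_view = [y[r] for r in rows]
        views.append((X_view, y_view, cols))
    return views
```

Each returned view would be passed to a separate decision tree learner, and the forest would combine the trees' votes. The tree learner and the combination rule are left out of this sketch.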
