Complexity Measures of Supervised Classification Problems

We studied a number of measures that characterize the difficulty of a classification problem, focusing on the geometrical complexity of the class boundary. We compared a set of real-world problems to random labelings of points and found that real problems contain structures in this measurement space that are significantly different from those of the random sets. Distributions of problems in this space show that there exist at least two independent factors affecting a problem's difficulty. We suggest using this space to describe a classifier's domain of competence. This can guide static and dynamic selection of classifiers for specific problems as well as for subproblems formed by confinement, projection, and transformations of the feature vectors.
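As a rough illustration of the comparison between structured problems and random labelings, the sketch below computes one simple boundary-sensitive quantity, the leave-one-out error rate of a 1-nearest-neighbour classifier, on a synthetic two-class problem and on the same points with shuffled labels. The choice of measure, the synthetic Gaussian data, and the helper names `loo_1nn_error` and `make_two_blobs` are assumptions made for this example only; the abstract does not specify which measures or data sets were used.

```python
import random


def loo_1nn_error(points, labels):
    """Leave-one-out error rate of a 1-nearest-neighbour classifier.

    Used here only as an illustrative difficulty measure (an assumption,
    not necessarily one of the measures studied in the paper).
    """
    errors = 0
    for i, p in enumerate(points):
        best_j, best_d = None, float("inf")
        for j, q in enumerate(points):
            if i == j:
                continue  # leave the query point itself out
            d = sum((a - b) ** 2 for a, b in zip(p, q))  # squared Euclidean distance
            if d < best_d:
                best_j, best_d = j, d
        if labels[best_j] != labels[i]:
            errors += 1
    return errors / len(points)


def make_two_blobs(n_per_class, rng):
    """Hypothetical toy problem: two well-separated 2-D Gaussian classes."""
    points, labels = [], []
    for label, centre in enumerate([(-3.0, 0.0), (3.0, 0.0)]):
        for _ in range(n_per_class):
            points.append((rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)))
            labels.append(label)
    return points, labels


rng = random.Random(0)  # fixed seed so the sketch is reproducible
points, labels = make_two_blobs(100, rng)

# Random labeling: same points, same class proportions, labels shuffled.
random_labels = labels[:]
rng.shuffle(random_labels)

structured_error = loo_1nn_error(points, labels)
random_error = loo_1nn_error(points, random_labels)
print("structured:", structured_error, "random labeling:", random_error)
```

On this toy problem the structured labeling should yield a much lower error than the shuffled one, which is the kind of separation in measurement space that the abstract describes. A single measure cannot reveal the two independent factors mentioned above; that would require placing problems in a space spanned by several measures.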
