Estimation of classification complexity

Classification problems vary in their level of complexity. Several methods have been proposed to quantify this level, but it remains difficult to measure. Linearly separable classification problems are among the easiest to solve, and there is a strong correlation between the degree of linear separability of a problem and its complexity: the more complex a problem is, the less linearly separable its data. Here we propose a novel and simple method for quantifying, on a scale from 0 to 1, the complexity level of classification problems based on the degree of linear separability of the data set representing the problem. The method is based on the transformation of nonlinearly separable problems into linearly separable ones. Results obtained on several benchmark data sets are provided.
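To make the idea concrete, here is a minimal sketch of a separability-based complexity score. It is an illustrative assumption, not the paper's actual method (which builds on transforming nonseparable problems into linearly separable ones): it simply trains a single perceptron and reports the fraction of points still misclassified, so a score of 0 indicates linear separability and larger scores suggest a harder problem.

```python
# Hedged sketch: score classification complexity in [0, 1] by how far a
# linear decision boundary falls short of separating the classes.
# NOTE: this perceptron-error proxy is an illustrative assumption; the
# paper's own method is more elaborate.

def perceptron_error_rate(X, y, epochs=100, lr=0.1):
    """Train a single perceptron and return the fraction of points it
    still misclassifies. 0.0 means the data set is linearly separable
    (given enough epochs); larger values indicate a less separable,
    hence more complex, problem."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):              # yi in {-1, +1}
            activation = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * activation <= 0:          # misclassified -> update
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    errors = sum(
        1 for xi, yi in zip(X, y)
        if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) <= 0
    )
    return errors / len(X)

# AND is linearly separable; XOR is the classic nonseparable case.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y_and = [-1, -1, -1, 1]
y_xor = [-1, 1, 1, -1]

and_score = perceptron_error_rate(X, y_and)   # 0.0: linearly separable
xor_score = perceptron_error_rate(X, y_xor)   # > 0: no line separates XOR
```

On the AND problem the perceptron converges and the score is 0; on XOR at least one of the four points must be misclassified by any linear boundary, so the score is at least 0.25, matching the intuition that less separable data means a more complex problem.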
