Estimation of Classifier Performance

An expression for expected classifier performance previously derived by the authors (ibid., vol.11, no.8, p.873-885, Aug. 1989) is applied to a variety of error estimation methods, and a unified and comprehensive approach to the analysis of classifier performance is presented. After the error expression is introduced, it is applied to three cases: (1) a given classifier and a finite test set; (2) given test distributions and a finite design set; and (3) finite and independent design and test sets. For all cases, the expected values and variances of the classifier errors are presented. Although the study of Case 1 does not produce any new results, it is important to confirm that the proposed approach reproduces the known results, and also to show how these results are modified when the design set becomes finite, as in Cases 2 and 3. The error expression is used to compute the bias between the leave-one-out and resubstitution errors for quadratic classifiers. The effect of outliers in design samples on the classification error is discussed. Finally, the theoretical analysis of the bootstrap method is presented for quadratic classifiers.
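The difference between the resubstitution and leave-one-out errors mentioned above can be illustrated empirically. The following is a minimal sketch, not the authors' analytical expression: it assumes two classes with equal priors, Gaussian plug-in estimates (sample mean and unbiased sample covariance), and NumPy; all function names are hypothetical and chosen for illustration only.

```python
import numpy as np

def fit_gaussian(x):
    """Sample mean and unbiased sample covariance of the rows of x."""
    return x.mean(axis=0), np.cov(x, rowvar=False)

def quad_distance(x, mean, cov):
    """Half the squared Mahalanobis distance plus half the log-determinant."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * d @ np.linalg.solve(cov, d) + 0.5 * logdet

def resubstitution_error(x1, x2):
    """Design and test the quadratic classifier on the same samples."""
    p1, p2 = fit_gaussian(x1), fit_gaussian(x2)
    wrong1 = sum(quad_distance(x, *p1) > quad_distance(x, *p2) for x in x1)
    wrong2 = sum(quad_distance(x, *p2) > quad_distance(x, *p1) for x in x2)
    # Equal priors are an assumption of this sketch.
    return 0.5 * (wrong1 / len(x1) + wrong2 / len(x2))

def leave_one_out_error(x1, x2):
    """Test each sample on a classifier designed without that sample."""
    def class_error(own, other):
        p_other = fit_gaussian(other)  # unaffected by removing a sample of `own`
        wrong = 0
        for i in range(len(own)):
            # Re-estimate the sample's own class without the sample itself.
            p_own = fit_gaussian(np.delete(own, i, axis=0))
            wrong += quad_distance(own[i], *p_own) > quad_distance(own[i], *p_other)
        return wrong / len(own)
    return 0.5 * (class_error(x1, x2) + class_error(x2, x1))
```

Calling both functions on the same pair of sample arrays (rows are samples, with more samples per class than dimensions so that the covariance estimates are nonsingular) gives the two estimates side by side. Under the assumptions above, removing a sample only increases its distance to its own class, so the leave-one-out estimate is never smaller than the resubstitution estimate.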

[1]  Keinosuke Fukunaga,et al.  Introduction to Statistical Pattern Recognition , 1972 .

[2]  M. R. Mickey,et al.  Estimation of Error Rates in Discriminant Analysis , 1968 .

[3]  D. J. Hand,et al.  Recent advances in error rate estimation , 1986, Pattern Recognit. Lett.

[4]  B. Efron.  Bootstrap Methods: Another Look at the Jackknife , 1979 .

[5]  Godfried T. Toussaint,et al.  Bibliography on estimation of misclassification , 1974, IEEE Trans. Inf. Theory.

[6]  Sarunas Raudys,et al.  On Dimensionality, Sample Size, Classification Error, and Complexity of Classification Algorithm in Pattern Recognition , 1980, IEEE Trans. Pattern Anal. Mach. Intell.

[7]  Donald H. Foley.  Considerations of sample and feature size , 1972, IEEE Trans. Inf. Theory.

[8]  Geoffrey J. McLachlan,et al.  Some Expected Values for the Error Rates of the Sample Quadratic Discriminant Function , 1975 .

[9]  Keinosuke Fukunaga,et al.  Effects of Sample Size in Classifier Design , 1989, IEEE Trans. Pattern Anal. Mach. Intell.

[10]  Keinosuke Fukunaga,et al.  Introduction to statistical pattern recognition (2nd ed.) , 1990 .

[11]  Anil K. Jain,et al.  Dimensionality and sample size considerations in pattern recognition practice , 1982, Classification, Pattern Recognition and Reduction of Dimensionality.

[12]  S. John.  Errors in Discrimination , 1961 .

[13]  C. Han,et al.  Distribution of discriminant function in circular models , 1970 .

[14]  Anil K. Jain,et al.  Bootstrap Techniques for Error Estimation , 1987, IEEE Trans. Pattern Anal. Mach. Intell.