Small Sample Size Effects in Statistical Pattern Recognition: Recommendations for Practitioners

The effects of sample size on feature selection and error estimation are discussed for several types of classifiers, with a focus on the two-class problem. Classifier design with a small design sample is explored, the estimation of error rates from a small test sample is addressed, and sample size effects in feature selection are examined. Recommendations are given for the choice of learning and test sample sizes. In addition to surveying prior work in this area, the emphasis is on practical advice for designers and users of statistical pattern recognition systems.
