Abstract

Classifier design and performance validation are important steps in the development of computer-aided diagnosis (CAD) systems. Within a CAD system, one or more classifiers may be used at various stages to differentiate malignant from benign lesions, or true lesions from false positives. A classifier is trained with case samples drawn from the patient population. The performance of the trained classifier on unknown samples depends on both the quality of the training samples (whether they are statistically representative of the patient population) and their quantity (the sample size). To evaluate the performance of the classifier (or of the CAD system), an independent set of test samples that the classifier has not seen during training should be used. Because samples with ground truth are often scarce in medical imaging research, finite sample size is a limiting factor in the development of CAD systems. In this talk, we review some of the issues associated with classifier design and validation under the constraint of finite sample size.
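The distinction between performance measured on the training samples and performance measured on independent test samples can be illustrated with a small simulation. The sketch below is not taken from the talk: the two-class Gaussian data, the nearest-mean classifier, and the names `simulate`, `sample`, and `accuracy` are all assumptions chosen for illustration. It compares accuracy on the training samples themselves (resubstitution) with accuracy on freshly drawn test samples as the training sample size grows.

```python
import random

def sample(n, dim, shift, rng):
    """Draw n feature vectors from a unit-variance Gaussian centered at `shift` on every axis."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]

def class_mean(xs):
    """Per-feature mean of a list of feature vectors."""
    n = len(xs)
    return [sum(col) / n for col in zip(*xs)]

def sq_dist(x, m):
    """Squared Euclidean distance between a sample and a class mean."""
    return sum((a - b) ** 2 for a, b in zip(x, m))

def accuracy(mean0, mean1, xs0, xs1):
    """Fraction of samples assigned to the class whose estimated mean is nearest."""
    correct = sum(sq_dist(x, mean0) <= sq_dist(x, mean1) for x in xs0)
    correct += sum(sq_dist(x, mean1) < sq_dist(x, mean0) for x in xs1)
    return correct / (len(xs0) + len(xs1))

def simulate(n_train, dim=20, shift=0.3, n_test=200, trials=50, seed=0):
    """Average resubstitution and independent-test accuracy over repeated training draws.

    n_train and n_test are per-class sample counts. All parameter values are
    illustrative assumptions, not values from the source.
    """
    rng = random.Random(seed)
    resub_total, indep_total = 0.0, 0.0
    for _ in range(trials):
        # Design the classifier from a finite training sample.
        train0 = sample(n_train, dim, 0.0, rng)
        train1 = sample(n_train, dim, shift, rng)
        mean0, mean1 = class_mean(train0), class_mean(train1)
        # Resubstitution: test on the same samples used for training.
        resub_total += accuracy(mean0, mean1, train0, train1)
        # Independent test: samples the classifier has never seen.
        test0 = sample(n_test, dim, 0.0, rng)
        test1 = sample(n_test, dim, shift, rng)
        indep_total += accuracy(mean0, mean1, test0, test1)
    return resub_total / trials, indep_total / trials

if __name__ == "__main__":
    for n in (5, 20, 80):
        resub, indep = simulate(n)
        print(f"n_train per class = {n:3d}  resubstitution = {resub:.3f}  independent = {indep:.3f}")
```

Under these assumptions, the resubstitution estimate should be optimistically biased when the training set is small, and the gap between the two estimates should shrink as the training sample size increases.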