It is known that, in general, the number of measurements in a pattern classification problem cannot be increased arbitrarily when the class-conditional densities are not completely known and only a finite number of learning samples is available. Above a certain number of measurements, performance begins to deteriorate instead of improving steadily. It was earlier shown by one of the authors that the case of binary independent measurements is an exception to this "curse of finite sample size" if a Bayesian approach is taken and uniform a priori densities are assumed on the unknown parameters. In this paper, two generalizations are considered: arbitrary quantization and the use of maximum likelihood estimates. Further, the existence of an optimal quantization complexity is demonstrated, and its relationship to both the dimensionality of the measurement vector and the sample size is discussed. It is shown that the optimum number of quantization levels decreases with increasing dimensionality for a fixed sample size, and increases with the sample size for a fixed dimensionality.
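To make the peaking behavior concrete, the following is a minimal Monte Carlo sketch, not taken from the paper: it draws class-conditional cell probabilities for independent quantized measurements from a uniform (Dirichlet) prior, forms maximum-likelihood (relative-frequency) estimates from a finite training sample, and reports the mean accuracy of the plug-in classifier as the number of quantization levels grows. All names and parameter values (dim=5, n_train=50, and so on) are illustrative assumptions, not the authors' setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_trial(n_levels, dim, n_train, n_test=2000):
    # Random true per-feature cell probabilities for each class, drawn from a
    # uniform Dirichlet prior; features are statistically independent.
    p0 = rng.dirichlet(np.ones(n_levels), size=dim)  # class 0: (dim, n_levels)
    p1 = rng.dirichlet(np.ones(n_levels), size=dim)  # class 1: (dim, n_levels)

    def sample(p, n):
        # Draw n i.i.d. feature vectors; feature d ~ Categorical(p[d]).
        u = rng.random((n, dim))
        return (u[:, :, None] > np.cumsum(p, axis=1)[None]).sum(axis=2)

    x0, x1 = sample(p0, n_train), sample(p1, n_train)

    def ml_estimate(x):
        # Maximum-likelihood (relative-frequency) estimates of the cell
        # probabilities, computed separately for each feature.
        est = np.zeros((dim, n_levels))
        for d in range(dim):
            est[d] = np.bincount(x[:, d], minlength=n_levels) / n_train
        return np.clip(est, 1e-12, None)  # guard against log(0)

    q0, q1 = ml_estimate(x0), ml_estimate(x1)
    t0, t1 = sample(p0, n_test), sample(p1, n_test)

    def loglik(x, q):
        # Log-likelihood under independence: sum of per-feature log cell probs.
        return np.log(q[np.arange(dim), x]).sum(axis=1)

    correct = ((loglik(t0, q0) > loglik(t0, q1)).mean()
               + (loglik(t1, q1) > loglik(t1, q0)).mean())
    return correct / 2

# Fixed dimensionality and sample size; vary the quantization complexity.
for q in (2, 4, 8, 16, 32):
    acc = np.mean([run_trial(q, dim=5, n_train=50) for _ in range(200)])
    print(f"levels={q:2d}  mean accuracy={acc:.3f}")
```

With dimensionality and sample size held fixed, the printed mean accuracy typically rises and then falls as the number of levels increases, consistent with the optimal quantization complexity the abstract describes; increasing n_train shifts the peak toward more levels.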