Large-Scale Simulation Studies in Image Pattern Recognition

Many obstacles to progress in image pattern recognition result from the fact that per-class distributions are often too irregular to be well-approximated by simple analytical functions. Simulation studies offer one way to circumvent these obstacles. We present three closely related studies of machine-printed character recognition that rely on synthetic data generated pseudo-randomly in accordance with an explicit stochastic model of document image degradations. The unusually large scale of the experiments that this methodology makes possible, involving several million samples, has allowed us to compute sharp estimates of the intrinsic difficulty (Bayes risk) of concrete image recognition problems, as well as the asymptotic accuracy and domain of competency of classifiers.
