A system for reading USA census '90 hand-written fields

A purpose-built recognition system is presented for reading the Industry or Employer section of the Official 1990 U.S. Census form. The system took part in the 2nd Census OCR Systems Conference, organised by the U.S. Bureau of the Census and the National Institute of Standards and Technology. It handles the complete reading task, from the scanned raster image of the form to an ASCII string containing what was written in the fields of the original form. The system is built from several processing blocks connected in sequence. The main operations of the recognition engine are: form identification, field isolation and bounding box removal, field and blob segmentation, broken character joining, isolated character recognition, word building, dictionary correction and, finally, hypothesis and confidence generation. Each processing step is described in detail. Results on the NIST Special Database 13 are also reported.
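
As an illustration of the dictionary correction and confidence generation steps named above, the following is a minimal sketch only, not the system's actual method. It assumes a plain Levenshtein edit distance and a confidence score derived from that distance. The function names `edit_distance` and `correct_word` and the small lexicon are hypothetical.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (unit cost per edit)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def correct_word(raw: str, lexicon: List[str]) -> Tuple[str, float]:
    """Return the closest lexicon entry and a confidence in [0, 1].

    Assumption: confidence is 1 minus the distance normalised by the
    longer of the two strings; the original system may score differently.
    """
    if not lexicon:
        raise ValueError("lexicon must not be empty")
    best = min(lexicon, key=lambda word: edit_distance(raw, word))
    dist = edit_distance(raw, best)
    confidence = 1.0 - dist / max(len(raw), len(best), 1)
    return best, confidence


if __name__ == "__main__":
    # Hypothetical lexicon of employer/industry words.
    print(correct_word("HOSPITAI", ["HOSPITAL", "SCHOOL", "FARM"]))
```

A low confidence value could be used to flag a field for rejection rather than returning a doubtful correction, in the spirit of the hypothesis and confidence generation step.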
