Gray-scale-image-based character recognition algorithm for low-quality and low-resolution images

Character recognition in low-quality, low-resolution images remains a challenging problem. In this paper we propose a gray-scale-image-based character recognition algorithm that is especially suited to gray-scale images captured from the real world and to very low-quality character images. We classify the deformations of low-quality, low-resolution character images into two categories: (1) high-spatial-frequency deformations, which arise from blurring by the point spread function (PSF) of scanners or cameras, from random noise, or from character deformations; and (2) low-spatial-frequency deformations, which arise mainly from large-scale background variations. Traditional recognition methods based on binary images cannot give satisfactory results on such images, because these deformations produce many touching or broken strokes during binarization. The proposed method extracts transform features directly from the gray-scale character images, which avoids the shortcomings introduced by binarization. It differs from existing gray-scale methods in that it avoids the difficult and unstable step of finding character structures in the images. By applying suitable feature selection algorithms, such as linear discriminant analysis (LDA) or principal component analysis (PCA), we can select the low-frequency components that preserve the fundamental shape of the characters and discard the high-frequency deformation components. We also develop a gray-level-histogram-based algorithm that uses the native integral ratio (NIR) technique to find a threshold that removes the background of a character image while preserving as much stroke detail as possible. Experiments show that the method is especially effective for recognizing low-quality, low-resolution images.
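The idea of keeping low-frequency transform components and discarding high-frequency ones can be illustrated with a minimal sketch. The abstract does not say which transform is used, so the 2D discrete cosine transform here is an assumption, and the fixed truncation to a small block of coefficients only stands in for the LDA or PCA selection step. The function names `dct_1d`, `dct_2d`, and `low_freq_features` are hypothetical.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a 1D sequence (assumed transform, not from the paper)."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(img):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in img]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # Transpose back so the result is indexed as coeffs[u][v].
    return [list(r) for r in zip(*cols)]

def low_freq_features(img, keep=4):
    """Keep only the keep x keep lowest-frequency coefficients as the feature vector.

    This fixed truncation is a stand-in for a learned selection such as LDA or PCA.
    """
    coeffs = dct_2d(img)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]
```

For a gray-scale character image given as a list of rows of pixel intensities, `low_freq_features(img, keep=4)` returns 16 values describing the coarse shape; fine detail such as noise lies mostly in the discarded coefficients.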
