A New Methodology for Gray-Scale Character Segmentation and Recognition

Binarizing a gray-scale image often discards information that is useful for segmenting touching or overlapping characters. In the gray-scale image itself, however, specific topographic features and intensity variations can be observed at character boundaries. In this paper, we propose a new methodology for character segmentation and recognition that exploits these characteristics of gray-scale images. In the proposed methodology, character segmentation regions are first determined using projection profiles and topographic features extracted from the gray-scale image. A nonlinear character segmentation path is then found in each segmentation region using a multi-stage graph search algorithm. Finally, a recognition-based segmentation method is adopted to confirm the nonlinear segmentation paths and the recognition results. Experiments on various kinds of printed documents indicate that the proposed methodology is very effective for the segmentation and recognition of touching and overlapping characters.
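To make the path-finding step concrete, the following is a minimal sketch of a multi-stage graph search over a gray-scale region, not the formulation used in the paper. It rests on assumptions the abstract does not state: each image row is treated as one stage, a path may move to the same or an adjacent column in the next row, and the cost of a pixel is its darkness (255 minus intensity), so that cheap paths run through bright background. The function names `projection_profile` and `segmentation_path` are hypothetical, and the topographic features are not modeled here.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of gray levels, 0 = black ink, 255 = white paper


def projection_profile(image: Image) -> List[int]:
    """Vertical projection profile: total darkness in each column.

    Low values suggest candidate segmentation regions (assumed criterion).
    """
    height, width = len(image), len(image[0])
    return [sum(255 - image[r][c] for r in range(height)) for c in range(width)]


def segmentation_path(
    image: Image, col_start: int, col_end: int
) -> Tuple[List[Tuple[int, int]], int]:
    """Minimum-cost top-to-bottom path inside columns [col_start, col_end].

    Each row is one stage of a multi-stage graph; a node (row, col) connects
    to the same or an adjacent column in the next row. Pixel cost is assumed
    to be its darkness, so the path avoids cutting through ink.
    """
    height = len(image)
    cols = list(range(col_start, col_end + 1))
    n = len(cols)
    cost = [[255 - image[r][c] for c in cols] for r in range(height)]

    best = [cost[0][:]]        # best[r][j]: cheapest cost to reach (r, cols[j])
    back = [[0] * n]           # back[r][j]: predecessor index in row r - 1
    for r in range(1, height):
        row_best, row_back = [], []
        for j in range(n):
            candidates = [k for k in (j - 1, j, j + 1) if 0 <= k < n]
            prev = min(candidates, key=lambda k: best[r - 1][k])
            row_best.append(best[r - 1][prev] + cost[r][j])
            row_back.append(prev)
        best.append(row_best)
        back.append(row_back)

    # Pick the cheapest end node in the last stage, then trace back.
    j = min(range(n), key=lambda k: best[-1][k])
    total = best[-1][j]
    path = []
    for r in range(height - 1, -1, -1):
        path.append((r, cols[j]))
        j = back[r][j]
    path.reverse()
    return path, total
```

On a region where the bright gap between two strokes runs diagonally, no straight vertical cut avoids the ink, while `segmentation_path` can follow the gap; this is the sense in which the resulting path is nonlinear. In a recognition-based scheme, each candidate path would then be accepted or rejected according to how well the resulting pieces are recognized.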
