Segmentation of touching characters in printed document recognition

A new discrimination function for segmenting touching characters, based on both pixel projection and profile projection, is presented. A dynamic recursive segmentation algorithm is developed to segment touching characters effectively. Contextual information and a spelling checker are used to correct errors caused by incorrect recognition and segmentation. On 12 real documents, the highest recognition accuracy achieved is 99.85% and the lowest is 99.4%.
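
The abstract does not give the form of the discrimination function, so the following is only an illustrative sketch of how a pixel projection and a profile projection might be combined to rank candidate cut columns. The function names, the equal weights, the height normalization, and the `margin` parameter are all assumptions made for this example and are not taken from the paper.

```python
# Illustrative sketch only: a simple column score built from two projections.
# The weighting and normalization are assumptions, not the paper's function.

def pixel_projection(img):
    """Number of ink pixels (value 1) in each column of a binary image."""
    return [sum(col) for col in zip(*img)]

def profile_projection(img):
    """Vertical extent of ink in each column: distance from the topmost
    to the bottommost ink pixel, inclusive (0 for an empty column)."""
    extents = []
    for col in zip(*img):
        rows = [r for r, v in enumerate(col) if v]
        extents.append(rows[-1] - rows[0] + 1 if rows else 0)
    return extents

def cut_scores(img, w_pixel=0.5, w_profile=0.5):
    """Hypothetical score per column; lower means a more plausible cut."""
    height = len(img)
    pix = pixel_projection(img)
    prof = profile_projection(img)
    return [w_pixel * p / height + w_profile * q / height
            for p, q in zip(pix, prof)]

def best_cut(img, margin=1):
    """Column with the lowest score, ignoring `margin` columns at each edge."""
    scores = cut_scores(img)
    candidates = range(margin, len(scores) - margin)
    return min(candidates, key=lambda c: scores[c])

# Two blobs joined by a one-pixel-high bridge in columns 3 and 4.
img = [
    [1, 1, 1, 0, 0, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1],
]
cut = best_cut(img)
```

In this toy image the two projections differ at columns 1 and 6, where a hole lowers the pixel count but not the vertical extent. Both projections are smallest at the thin bridge, so the sketch selects a bridge column as the cut. A recursive scheme could then apply the same scoring to each resulting piece.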
