Machine printed character segmentation: An overview

Abstract This paper is part I of a two-part review series. We present an overview of character segmentation techniques for machine-printed documents. To date, in most Optical Character Recognition (OCR) systems, whether commercial products or systems described in the published literature, recognition algorithms have been developed on isolated characters. Character segmentation is all too often ignored in the research community, yet broken and touching characters are responsible for the majority of errors in the automatic reading of both machine-printed and handwritten text. We cover techniques for segmenting uniform-pitch and proportional fonts and for handling broken and touching characters; we consider both techniques based on text image features and techniques based on recognition results.
