of Pattern Recognition and Artificial Intelligence Signature Segmentation from Machine Printed Documents using Contextual Information

Automatic signature segmentation from a printed document is a challenging task because of the nature of the signatory's handwriting and because signature strokes overlap or touch printed text, graphics, noise, etc. In this paper, we propose a two-stage approach for segmenting signatures from a document page. In the first stage, the document is segmented into blocks, and each block is classified as either a signature block or a printed word block. Gradient-based features are extracted from each block, and a support vector machine classifier performs the block-wise classification. In the second stage, printed characters that are present in isolated form, or that overlap or touch the signature, are removed from the signature blocks. Isolated printed characters, if any, are removed from each detected signature block using context information. To detect overlapping/touching printed strokes in a signature block, we first detect hypothetical zones where overlapping/touching may occur, using the bounding box information of neighboring printed word blocks and the local linearity of character strings near the signature block. Next, to detect the overlapping/touching printed strokes within these hypothetical zones, we use skeleton junction points together with the corner points of contours obtained by the Douglas and Peucker polygonal approximation algorithm. Finally, the touching signature strokes are separated from text characters using contour smoothness information near the skeleton junction points. Experiments were performed on the "Tobacco-800" dataset [The Legacy Tobacco Document Library (LTDL), available at http://legacy.library.ucsf.edu/, University of California, San Francisco, 2007], and the results are promising.
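The abstract names the Douglas and Peucker polygonal approximation algorithm as the source of contour corner points. The following is a minimal sketch of that algorithm on its own, not the authors' implementation: the function names, the `epsilon` tolerance, and the use of point-to-segment distance (rather than distance to the infinite line) are assumptions made for illustration. The vertices that survive the simplification are the candidate corner points.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment (a == b): fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Simplify an open polyline; epsilon is a hypothetical tolerance in pixels."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first-last.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= epsilon:
        # Every interior point is close enough: keep only the endpoints.
        return [first, last]
    # Otherwise keep the farthest point and recurse on both halves.
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point


# A slightly noisy L-shaped stroke: the bend at (3, 0) is the only corner kept.
stroke = [(0, 0), (1, 0.05), (2, -0.05), (3, 0), (3, 1), (3.05, 2), (3, 3)]
corners = douglas_peucker(stroke, epsilon=0.2)
```

A larger `epsilon` yields fewer, coarser corner points; how the paper chooses its tolerance is not stated in the abstract.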

[1]  David H. Douglas, et al.  Algorithms for the Reduction of the Number of Points Required to Represent a Digitized Line or Its Caricature , 1973 .

[2]  Bidyut Baran Chaudhuri,et al.  A System for Joining and Recognition of Broken Bangla Numerals for Indian Postal Automation , 2004, ICVGIP.

[3]  Tetsushi Wakabayashi,et al.  Handwritten Numeral Recognition of Six Popular Indian Scripts , 2007, Ninth International Conference on Document Analysis and Recognition (ICDAR 2007).

[4]  Alireza Khotanzad,et al.  Invariant Image Recognition by Zernike Moments , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[5]  David S. Doermann,et al.  Signature Detection and Matching , 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  David S. Doermann,et al.  A robust stamp detection framework on degraded documents , 2006, Electronic Imaging.

[7]  Jinhong Katherine Guo,et al.  Separating handwritten material from machine printed text using hidden Markov models , 2001, Proceedings of Sixth International Conference on Document Analysis and Recognition.

[8]  David S. Doermann,et al.  The Segmentation and Identification of Handwriting in Noisy Document Images , 2002, Document Analysis Systems.

[9]  Venu Govindaraju,et al.  Markov Random Field Based Text Identification from Annotated Machine Printed Documents , 2009, 2009 10th International Conference on Document Analysis and Recognition.

[10]  Meng Shi,et al.  Handwritten numeral recognition using gradient and curvature of gray scale image , 2002, Pattern Recognit..

[11]  N. Otsu.  A threshold selection method from gray level histograms , 1979 .

[12]  Venu Govindaraju,et al.  Text Separation from Mixed Documents Using a Tree-Structured Classifier , 2010, 2010 20th International Conference on Pattern Recognition.

[13]  Bidyut Baran Chaudhuri,et al.  Automatic separation of machine-printed and hand-written text lines , 1999, Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318).

[14]  Jacques P. Swanepoel,et al.  Off-line Signature Verification Using Flexible Grid Features and Classifier Fusion , 2010, 2010 12th International Conference on Frontiers in Handwriting Recognition.

[15]  David S. Doermann,et al.  Logo Matching for Document Image Retrieval , 2009, 2009 10th International Conference on Document Analysis and Recognition.

[16]  Venu Govindaraju,et al.  Overlapped text segmentation using Markov random field and aggregation , 2010, DAS '10.

[17]  Jesús Francisco Vargas-Bonilla,et al.  The 4NSigComp2010 Off-line Signature Verification Competition: Scenario 2 , 2010, 2010 12th International Conference on Frontiers in Handwriting Recognition.

[18]  Rabab Kreidieh Ward,et al.  A Rotation Invariant Rule-Based Thinning Algorithm for Character Recognition , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[19]  Venu Govindaraju,et al.  Identifying Handwritten Text in Mixed Documents , 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[20]  M. Aurangzeb Khan,et al.  Velocity-Image Model for Online Signature Verification , 2006, IEEE Transactions on Image Processing.

[21]  Robert Sabourin,et al.  A neural network approach to off-line signature verification using directional PDF , 1996, Pattern Recognit..

[22]  Umapada Pal,et al.  A System to Segment Text and Symbols from Color Maps , 2007, GREC.

[23]  Jayant Kumar,et al.  Shape codebook based handwritten and machine printed text zone extraction , 2011, Electronic Imaging.

[24]  Jesús Francisco Vargas-Bonilla,et al.  Off-line signature verification based on grey level information using texture features , 2011, Pattern Recognit..

[25]  Michael R. Lyu,et al.  A Fast 2D Shape Recovery Approach by Fusing Features and Appearance , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.