Signature Segmentation from Machine Printed Documents Using Conditional Random Field

Automatic separation of signatures from a document page is challenging because of the free-flowing nature of handwriting, signature parts that overlap or touch printed text, noise, and similar factors. In this paper, we propose a novel approach for segmenting signatures from machine-printed signed documents. The algorithm first locates the signature block in the document using word-level feature extraction. Next, the signature strokes that touch or overlap printed text are separated. A stroke-level classification is then performed using skeleton analysis to separate the overlapping strokes of printed text from the signature. Gradient-based features and a Support Vector Machine (SVM) are used in our scheme. Finally, a Conditional Random Field (CRF) model, whose energy is minimized through approximate labeling by graph cut, is applied to label the strokes as "signature" or "printed text" for accurate segmentation of signatures. Signature segmentation experiments were performed on the "tobacco" dataset, and we obtained encouraging results.
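The final labeling step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a pairwise energy of the form E(y) = sum_i U_i(y_i) + lambda * sum_(i,j) w_ij [y_i != y_j], where the unary costs U_i might come from a classifier such as the SVM and the weights w_ij link neighbouring strokes. The function names, the cost values, and the small Edmonds-Karp max-flow solver are all hypothetical choices made for illustration. For two labels and this kind of smoothness term, a minimum s-t cut gives the labeling with the lowest energy.

```python
from collections import deque

LABELS = ("signature", "printed text")

def energy(labels, unary, edges, smoothness=1.0):
    """Energy of a labeling: unary costs plus a penalty for each disagreeing edge."""
    total = sum(unary[i][LABELS.index(lab)] for i, lab in enumerate(labels))
    total += sum(smoothness * w for i, j, w in edges if labels[i] != labels[j])
    return total

def graph_cut_labels(unary, edges, smoothness=1.0, eps=1e-12):
    """Label strokes by a minimum s-t cut (hypothetical helper).

    unary[i] = (cost of "signature", cost of "printed text") for stroke i.
    edges    = [(i, j, weight)] for neighbouring strokes i and j.
    """
    n = len(unary)
    source, sink = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)  # reverse edge for the residual graph

    for i, (cost_sig, cost_txt) in enumerate(unary):
        add_edge(source, i, cost_txt)  # cut when stroke i ends on the sink side
        add_edge(i, sink, cost_sig)    # cut when stroke i stays on the source side
    for i, j, w in edges:
        add_edge(i, j, smoothness * w)
        add_edge(j, i, smoothness * w)

    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > eps and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no path left: the flow is maximal
        # Find the bottleneck capacity, then push that much flow along the path.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u

    # Strokes still reachable from the source take the "signature" label.
    return [LABELS[0] if i in parent else LABELS[1] for i in range(n)]

# Made-up costs for four strokes in a chain; stroke 1 alone leans "printed text".
unary = [(0.1, 0.9), (0.6, 0.4), (0.2, 0.8), (0.9, 0.1)]
edges = [(0, 1, 0.5), (1, 2, 0.5), (2, 3, 0.1)]
labels = graph_cut_labels(unary, edges)
print(labels, energy(labels, unary, edges))
```

In this toy example the smoothness term outweighs the weak unary preference of stroke 1, so it is labeled together with its strongly connected neighbours. This is the kind of contextual correction a CRF adds on top of independent per-stroke classification.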
