Document seal detection using GHT and character proximity graphs

This paper addresses the automatic detection of seals (stamps) in documents with cluttered backgrounds. Seal detection is challenging because seals appear at multiple orientations, have arbitrary shapes, may overlap with signatures, and are often degraded by noise. Here, a seal is characterized by scale- and rotation-invariant spatial feature descriptors computed from the recognition results of its individual connected components (characters). Scale- and rotation-invariant features are fed to a Support Vector Machine (SVM) classifier to recognize multi-scale and multi-oriented text characters. The generalized Hough transform (GHT) is then used to detect the seal: a voting scheme proposes possible seal locations in a document based on the spatial feature descriptors of neighboring component pairs, and the peak of votes in the GHT accumulator validates the hypothesis and locates the seal. Experiments are performed on an archive of historical documents containing handwritten and printed English text. The results show that the method is robust in locating seal instances of arbitrary shape and orientation, and that it is also efficient for indexing a collection of documents for retrieval purposes.
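
The pair-based voting idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes character labels and centroids are already available (in the paper these would come from the SVM recognizer), and the function names `build_r_table`, `vote`, `locate`, and `place`, the accumulator cell size, and the toy seal model are all hypothetical. Each ordered pair of labeled characters stores the seal center in a frame defined by the pair itself (distance ratio and relative angle), so the stored entry does not change when the seal is scaled or rotated.

```python
import math
from collections import defaultdict
from itertools import permutations


def build_r_table(model_chars, center):
    """Index the seal center by ordered pairs of character labels.

    model_chars: list of (label, x, y) centroids from a reference seal.
    Each entry stores (distance ratio, relative angle), which is
    invariant to scale and rotation of the seal.
    """
    table = defaultdict(list)
    for (la, xa, ya), (lb, xb, yb) in permutations(model_chars, 2):
        d = math.hypot(xb - xa, yb - ya)
        if d == 0:
            continue
        theta = math.atan2(yb - ya, xb - xa)
        r = math.hypot(center[0] - xa, center[1] - ya)
        phi = math.atan2(center[1] - ya, center[0] - xa)
        table[(la, lb)].append((r / d, phi - theta))
    return table


def vote(doc_chars, table, cell=10, max_dist=float("inf")):
    """Cast one vote per matching label pair into a coarse accumulator."""
    acc = defaultdict(int)
    for (la, xa, ya), (lb, xb, yb) in permutations(doc_chars, 2):
        d = math.hypot(xb - xa, yb - ya)
        if d == 0 or d > max_dist:  # max_dist restricts voting to neighbors
            continue
        theta = math.atan2(yb - ya, xb - xa)
        for ratio, dphi in table.get((la, lb), ()):
            cx = xa + ratio * d * math.cos(theta + dphi)
            cy = ya + ratio * d * math.sin(theta + dphi)
            acc[(int(cx // cell), int(cy // cell))] += 1
    return acc


def locate(acc, cell=10):
    """Return the center of the accumulator cell with the most votes."""
    if not acc:
        return None
    (i, j), votes = max(acc.items(), key=lambda kv: kv[1])
    return ((i + 0.5) * cell, (j + 0.5) * cell), votes


def place(chars, scale, angle_deg, tx, ty):
    """Scale, rotate, and translate a set of labeled points (test helper)."""
    a = math.radians(angle_deg)
    return [
        (label,
         tx + scale * (x * math.cos(a) - y * math.sin(a)),
         ty + scale * (x * math.sin(a) + y * math.cos(a)))
        for label, x, y in chars
    ]


# Toy seal model (assumed): four characters around a center at (0, 0).
model = [("A", 0, 10), ("B", 10, 0), ("C", 0, -10), ("D", -10, 0)]
table = build_r_table(model, center=(0.0, 0.0))

# Document: the seal at twice the size, rotated 30 degrees, plus clutter.
doc = place(model, scale=2.0, angle_deg=30.0, tx=205.0, ty=305.0)
doc += [("A", 50.0, 50.0), ("X", 400.0, 100.0)]

acc = vote(doc, table, cell=10)
location, votes = locate(acc, cell=10)
print(location, votes)
```

In this toy setup all pairs belonging to the transformed seal vote for the same accumulator cell, while pairs involving the clutter characters scatter their votes, so the peak marks the seal center. A real system would also have to cope with misrecognized characters and repeated labels, which this sketch ignores.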
