Segmentation of Scanned Documents Using Deep-Learning Approach

In this paper we present an approach to the automatic segmentation of elements of interest in paper documents, i.e. stamps, logos, printed text blocks, signatures, and tables. The presented approach performs object detection by means of a Convolutional Neural Network (CNN). The resulting regions are then integrated based on their confidence level and shape. Experiments performed on a representative set of digitized paper documents demonstrated the usefulness and efficiency of the developed approach. The results were compared with standard cascade-based detection and showed the superiority of the CNN-based approach.
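The integration step mentioned above can be illustrated with a minimal sketch. The abstract does not specify how confidence and shape are combined, so everything below is an assumption made for illustration: the `Detection` type, the `min_conf` and `min_iou` thresholds, and the choice to reduce "shape" to bounding-box overlap (IoU) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Detection:
    label: str         # e.g. "stamp", "logo", "signature" (hypothetical labels)
    confidence: float  # detector score in [0, 1]
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def integrate(detections: List[Detection],
              min_conf: float = 0.5,
              min_iou: float = 0.3) -> List[Detection]:
    """Drop low-confidence regions, then merge overlapping same-class regions.

    Assumed rule (not from the paper): regions are visited in order of
    decreasing confidence; a region that overlaps an already kept region
    of the same class is absorbed into it, and the kept region grows to
    the bounding box of both while retaining the higher confidence.
    """
    kept = sorted((d for d in detections if d.confidence >= min_conf),
                  key=lambda d: d.confidence, reverse=True)
    merged: List[Detection] = []
    for det in kept:
        for i, m in enumerate(merged):
            if m.label == det.label and iou(m.box, det.box) >= min_iou:
                union_box = (min(m.box[0], det.box[0]), min(m.box[1], det.box[1]),
                             max(m.box[2], det.box[2]), max(m.box[3], det.box[3]))
                merged[i] = Detection(m.label, m.confidence, union_box)
                break
        else:
            merged.append(det)
    return merged


if __name__ == "__main__":
    sample = [
        Detection("stamp", 0.9, (0, 0, 10, 10)),
        Detection("stamp", 0.6, (2, 2, 12, 12)),
        Detection("logo", 0.8, (50, 50, 60, 60)),
        Detection("stamp", 0.2, (0, 0, 5, 5)),
    ]
    for d in integrate(sample):
        print(d)
```

In this sketch the two overlapping stamp regions collapse into a single region and the low-confidence one is discarded. A real system would likely tune the thresholds per class and could add further shape constraints, such as aspect ratio or minimum area.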
