Image-based logical document structure recognition

The paper presents a complete solution for recognizing textual and graphic structures in various types of documents acquired from the Internet. In the proposed approach, the document structure recognition problem is divided into two sub-problems: first, localizing logical structure elements within the document, and second, recognizing the segmented logical structure elements. The input to the method is an image of a document page; the output is an XML file containing all graphic and textual elements included in the document, preserving the reading order of the document blocks. This file records the identity and position of every logical element in the document image. The paper describes the proposed method in detail and reports the results of experiments validating its effectiveness. For paragraph structure recognition, the results of the proposed method are comparable to those of the referenced methods, which perform segmentation only.
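As a rough illustration of the kind of output described above, the following minimal sketch serializes a list of already recognized blocks to XML in reading order. The abstract does not specify the schema, so the element and attribute names (`page`, `block`, `order`, `type`, `x`, `y`, `width`, `height`), the `Block` class, and the block labels are all hypothetical. The sketch also assumes the blocks arrive in reading order; it does not implement the localization or recognition steps.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Block:
    """One logical element; field names and labels are illustrative assumptions."""
    kind: str          # e.g. "heading", "paragraph", "image"
    x: int             # left edge of the bounding box, in pixels
    y: int             # top edge of the bounding box, in pixels
    width: int
    height: int
    text: str = ""     # recognized text; empty for graphic elements


def blocks_to_xml(blocks):
    """Serialize blocks (assumed to be in reading order) to an XML string."""
    page = ET.Element("page")
    for order, block in enumerate(blocks, start=1):
        element = ET.SubElement(page, "block", {
            "order": str(order),       # position in the reading order
            "type": block.kind,        # identity of the logical element
            "x": str(block.x),
            "y": str(block.y),
            "width": str(block.width),
            "height": str(block.height),
        })
        if block.text:
            element.text = block.text
    return ET.tostring(page, encoding="unicode")


if __name__ == "__main__":
    example = [
        Block("heading", 40, 30, 500, 40, "Title"),
        Block("image", 40, 90, 300, 200),
    ]
    print(blocks_to_xml(example))
```

Storing the reading order as an explicit attribute, rather than relying only on element order, is a design choice of this sketch and not something the abstract states.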
