Analysis of the Logical Layout of Documents

Automatic document understanding is one of the most important tasks when dealing with printed documents, since all downstream systems depend on the process-relevant data captured from them. Analysis of the logical layout of documents not only enables automatic conversion into a semantically marked-up electronic representation but also opens up higher-level functionality such as advanced search (e.g., limiting search to titles only), automatic routing of business letters, automatic processing of invoices, and link structures that facilitate navigation through books. Over the last three decades, a number of techniques have been proposed to address the challenges arising in logical layout analysis of documents from many different domains. This chapter provides a comprehensive review of the state of the art in automated document understanding, highlights key methods developed for different target applications, and offers practical recommendations for designing a document understanding system for the problem at hand.
