Paper information - TableNet: Deep Learning Model for End-to-end Table Detection and Tabular Data Extraction from Scanned Document Images

With the widespread use of mobile phones and scanners to photograph and upload documents, the need to extract the information trapped in unstructured document images such as retail receipts, insurance claim forms and financial invoices is becoming more acute. A major hurdle to this objective is that these images often contain information in the form of tables, and extracting data from tabular sub-images presents a unique set of challenges. These include accurately detecting the tabular region within an image, and subsequently detecting and extracting information from the rows and columns of the detected table. While some progress has been made in table detection, extracting the table contents is still a challenge since it involves finer-grained recognition of table structure (rows and columns). Prior approaches have attempted to solve the table detection and structure recognition problems independently using two separate models. In this paper, we propose TableNet: a novel end-to-end deep learning model for both table detection and structure recognition. The model exploits the interdependence between the twin tasks of table detection and table structure recognition to segment out the table and column regions. This is followed by semantic rule-based row extraction from the identified tabular sub-regions. The proposed model and extraction approach were evaluated on the publicly available ICDAR 2013 and Marmot Table datasets, obtaining state-of-the-art results. Additionally, we demonstrate that feeding additional semantic features further improves model performance and that the model exhibits transfer learning across datasets. Another contribution of this paper is to provide additional table structure annotations for the Marmot data, which currently only has annotations for table detection.
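The abstract does not spell out the rules used for row extraction, so the following is only a minimal sketch of one plausible heuristic, not the authors' method. It assumes that word bounding boxes inside a detected table region are already available (for example from an OCR engine) and groups them into rows by vertical overlap. The `Word` tuple layout, the function name `group_words_into_rows`, and the `min_overlap` threshold are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical word record: (text, x0, y0, x1, y1), with y growing downward.
Word = Tuple[str, float, float, float, float]


def group_words_into_rows(words: List[Word], min_overlap: float = 0.5) -> List[List[str]]:
    """Group words into rows when their vertical extents overlap enough.

    A word joins the current row if the vertical overlap with that row is at
    least `min_overlap` times the smaller of the two heights (an assumed rule).
    """
    rows = []  # each entry: (top, bottom, [words in the row])
    # Visit words from top to bottom, ordered by vertical centre.
    for word in sorted(words, key=lambda w: (w[2] + w[4]) / 2):
        _, _, y0, _, y1 = word
        if rows:
            top, bottom, members = rows[-1]
            overlap = min(bottom, y1) - max(top, y0)
            smaller_height = min(bottom - top, y1 - y0)
            if overlap >= min_overlap * smaller_height:
                # Extend the current row to cover the new word.
                rows[-1] = (min(top, y0), max(bottom, y1), members + [word])
                continue
        rows.append((y0, y1, [word]))
    # Within each row, order words left to right and keep only the text.
    return [[w[0] for w in sorted(members, key=lambda w: w[1])]
            for _, _, members in rows]


if __name__ == "__main__":
    sample = [
        ("Qty", 100, 10, 130, 22),
        ("Item", 10, 11, 50, 23),
        ("Pen", 10, 30, 40, 42),
        ("2", 100, 31, 110, 43),
    ]
    for row in group_words_into_rows(sample):
        print(row)
```

In a full pipeline, the words in each row would then be assigned to the column regions predicted by the segmentation model; that step is omitted here because it depends on details the abstract does not give.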