TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition

A table arranging data in rows and columns is a very effective data structure, which has been widely used in business and scientific research. Considering large-scale tabular data in online and offline documents, automatic table recognition has attracted increasing attention from the document analysis community. Though human can easily understand the structure of tables, it remains a challenge for machines to understand that, especially due to a variety of different table layouts and styles. Existing methods usually model a table as either the markup sequence or the adjacency matrix between different table cells, failing to address the importance of the logical location of table cells, e.g., a cell is located in the first row and the second column of the table. In this paper, we reformulate the problem of table structure recognition as the table graph reconstruction, and propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition. Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells. Experimental results on three popular table recognition datasets and a new dataset with table graph annotations (TableGraph-350K) demonstrate the effectiveness of the proposed TGRNet for table structure recognition. Code and annotations will be made publicly available at https://github.com/xuewenyuan/TGRNet.

[1]  Lovekesh Vig,et al.  TableNet: Deep Learning Model for End-to-end Table Detection and Tabular Data Extraction from Scanned Document Images , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).

[2]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Yu Fang,et al.  ICDAR 2019 Competition on Table Detection and Recognition (cTDaR) , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).

[4]  Hye-Young Paik,et al.  TEXUS: A Task-based Approach for Table Extraction and Understanding , 2015, DocEng.

[5]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Zhoujun Li,et al.  TableBank: Table Benchmark for Image-based Table Detection and Recognition , 2019, LREC.

[7]  Jinquan Zeng,et al.  GPS-Net: Graph Property Sensing Network for Scene Graph Generation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Yolande Belaïd,et al.  An Invoice Reading System Using a Graph Convolutional Network , 2018, ACCV Workshops.

[9]  Tamir Hassan,et al.  ICDAR 2013 Table Competition , 2013, 2013 12th International Conference on Document Analysis and Recognition.

[10]  Faisal Shafait,et al.  Rethinking Table Recognition using Graph Neural Networks , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).

[11]  Zheng Huang,et al.  GFTE: Graph-based Financial Table Extraction , 2020, ICPR Workshops.

[12]  D. Prasad,et al.  CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[13]  Zhen Zhang,et al.  Table Analysis and Information Extraction for Medical Laboratory Reports , 2018, 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech).

[14]  Shoaib Ahmed Siddiqui,et al.  DeepTabStR: Deep Learning based Table Structure Recognition , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).

[15]  David S. Rosenberg,et al.  Challenges in End-to-End Neural Scientific Table Recognition , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).

[16]  Alexey O. Shigarov,et al.  Configurable Table Structure Recognition in Untagged PDF documents , 2016, DocEng.

[17]  Dacheng Tao,et al.  Empowering Things With Intelligence: A Survey of the Progress, Challenges, and Opportunities in Artificial Intelligence of Things , 2020, IEEE Internet of Things Journal.

[18]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[19]  Faisal Shafait,et al.  Table Structure Extraction with Bi-Directional Gated Recurrent Unit Networks , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).

[20]  Thomas Muller,et al.  TaPas: Weakly Supervised Table Parsing via Pre-training , 2020, ACL.

[21]  Gang Hua,et al.  Ordinal Regression with Multiple Output CNN for Age Estimation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Mihaela van der Schaar,et al.  VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain , 2020, NeurIPS.

[23]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Massimo Ruffolo,et al.  PDF-TREX: An Approach for Recognizing and Extracting Tables from PDF Documents , 2009, 2009 10th International Conference on Document Analysis and Recognition.

[26]  Heyan Huang,et al.  Complicated Table Structure Recognition , 2019, ArXiv.

[27]  Brian L. Price,et al.  Deep Splitting and Merging for Table Structure Decomposition , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).

[28]  Andreas Dengel,et al.  DeepDeSRT: Deep Learning for Detection and Structure Recognition of Tables in Document Images , 2017, 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR).

[29]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[31]  Pierre Dognin,et al.  Tabular Transformers for Modeling Multivariate Time Series , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[32]  Graham Neubig,et al.  TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data , 2020, ACL.

[33]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[34]  Shoaib Ahmed Siddiqui,et al.  Rethinking Semantic Segmentation for Table Structure Recognition in Documents , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).

[35]  Dacheng Tao,et al.  ReS2TIM: Reconstruct Syntactic Structures from Table Images , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).