Paper Information - Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context

Lucian Popa

D. Burdick

N. Wang

Xinyi Zheng

Abstract: Documents are often used for knowledge sharing and preservation in business and science, and the tables within them capture most of the critical data. Unfortunately, most documents are stored and distributed as PDFs or scanned images, which fail to preserve logical table structure. Recent vision-based deep learning approaches have been proposed to address this gap, but most still cannot achieve state-of-the-art results. We present Global Table Extractor (GTE), a vision-guided systematic framework for joint table detection and cell structure recognition, which can be built on top of any object detection model. With GTE-Table, we introduce a new penalty, based on the natural cell containment constraint of tables, to train our table network with the aid of cell location predictions. GTE-Cell is a new hierarchical cell detection network that leverages table styles. Further, we design a method to automatically label table and cell structure in existing documents, which makes it cheap to create a large corpus of training and test data. We use this method to enhance PubTabNet with cell labels and to create FinTabNet, real-world and complex scientific and financial datasets with detailed table structure annotations to help train and test structure recognition. Our framework surpasses previous state-of-the-art results on the ICDAR 2013 and ICDAR 2019 table competitions in both table detection and cell structure recognition. Further experiments demonstrate a greater than 45% improvement in cell structure recognition when compared to a vanilla RetinaNet object detection model on our new out-of-domain FinTabNet.
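The abstract does not spell out how the cell containment penalty is computed. As a rough illustration of the underlying idea only (cells should lie inside their table), the sketch below scores a predicted table box by the fraction of predicted cell area that falls outside it. The function names, the box format `(x1, y1, x2, y2)`, and the area-ratio formula are all assumptions made for this example and are not the loss used by GTE-Table.

```python
from typing import List, Tuple

# Axis-aligned box as (x1, y1, x2, y2); this format is an assumption of the sketch.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (zero if they do not overlap)."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return area(overlap)


def containment_penalty(table: Box, cells: List[Box]) -> float:
    """Hypothetical penalty: share of total cell area lying outside the table box.

    Returns 0.0 when every cell is fully contained (or there are no cells)
    and approaches 1.0 as cells fall entirely outside the table.
    """
    total = sum(area(c) for c in cells)
    if total == 0.0:
        return 0.0
    outside = sum(area(c) - intersection_area(c, table) for c in cells)
    return outside / total


if __name__ == "__main__":
    table_box = (0.0, 0.0, 10.0, 10.0)
    cell_boxes = [(1.0, 1.0, 3.0, 3.0), (8.0, 8.0, 12.0, 10.0)]
    print(containment_penalty(table_box, cell_boxes))
```

In this toy setup a table prediction that cuts through predicted cells receives a non-zero score, which is the kind of signal a containment constraint could feed back into training a table detector.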

Cited by

LakeBench: Benchmarks for Data Discovery over Data Lakes, 2023, ArXiv.

Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer, 2023, Pattern Recognit.

Show, Read and Reason: Table Structure Recognition with Flexible Context Aggregator, 2021, ACM Multimedia.

Parsing Table Structures in the Wild, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

An Accuracy-Maximization Approach for Claims Classifiers in Document Content Analytics for Cybersecurity, 2022, J. Cybersecur. Priv.

Beyond Document Page Classification: Design, Datasets, and Challenges, 2023, ArXiv.

AttesTable at SemEval-2021 Task 9: Extending Statement Verification with Tables for Unknown Class, and Semantic Evidence Finding, 2021, SEMEVAL.

Design of a Novel Information System for Semi-automated Management of Cybersecurity in Industrial Control Systems, 2022, ACM Trans. Manag. Inf. Syst.

Retrieving Complex Tables with Multi-Granular Graph Representation Learning, 2021, SIGIR.

FinQA: A Dataset of Numerical Reasoning over Financial Data, 2021, EMNLP.

Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? A Study on Several Typical Tasks, 2023, EMNLP.

DocILE 2023 Teaser: Document Information Localization and Extraction, 2023, ECIR.

Current Status and Performance Analysis of Table Recognition in Document Images With Deep Neural Networks, 2021, IEEE Access.

TDeLTA: A Light-weight and Robust Table Detection Method based on Learning Text Arrangement, 2023, AAAI.

TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content, 2023, MMIR@MM.

CORD-19: The Covid-19 Open Research Dataset, 2020, NLPCOVID19.

WEATHERGOV+: A Table Recognition and Summarization Dataset to Bridge the Gap Between Document Image Analysis and Natural Language Generation, 2023, DocEng.


DocTr: Document Transformer for Structured Information Extraction in Documents

SciA11y: Converting Scientific Papers to Accessible HTML

KAAPA: Knowledge Aware Answers from PDF Analysis
