Paper Citations

William A. Barrett, Heath E. Nielson,

Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.

Zoning documents increases the resolution of indexing from the image level to the field level. A line-delimited tabular document forms a well defined series of regions. However, as image quality decrease...

Disentangling the Structure of Tables in Scientific Literature

Goran Nenadic, Robert Hernandez, Nikola Milosevic et al.,

2016,

NLDB

Within the scientific literature, tables are commonly used to present factual and statistical information in a compact way, which is easy to digest by readers. The ability to “understand” the structur...

TableNet: Deep Learning Model for End-to-end Table Detection and Tabular Data Extraction from Scanned Document Images

Lovekesh Vig, Monika Sharma, Shubham Paliwal et al.,

2019 International Conference on Document Analysis and Recognition (ICDAR)

With the widespread use of mobile phones and scanners to photograph and upload documents, the need for extracting the information trapped in unstructured document images such as retail receipts, insur...

A table-form extraction with artefact removal

Flávio Bortolozzi, Jacques Facon, Luiz Antônio Pereira Neves et al.,

2007,

SAC '07

We present a novel methodology for extracting the structure of handwritten filled table-forms. The method identifies the table-form line intersections, detecting and correcting wrong intersections pro...

TableBank: Table Benchmark for Image-based Table Detection and Recognition

Zhoujun Li, Ming Zhou, Furu Wei et al.,

2019,

LREC

We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on the internet. Existing research for image-based table...

DocParser: Hierarchical Structure Parsing of Document Renderings

Stefan Feuerriegel, Ce Zhang, Johannes Rausch et al.,

2019,

ArXiv

Translating document renderings (e.g. PDFs, scans) into hierarchical structures is extensively demanded in the daily routines of many real-world applications, and is often a prerequisite step of many ...

Applying the T-Recs table recognition system to the business letter domain

Thomas Kieninger, Andreas Dengel et al.,

2001,

Proceedings of Sixth International Conference on Document Analysis and Recognition

This paper summarizes the core idea of the T-Recs table recognition system, an integrated system covering block-segmentation, table location and a model-free structural analysis of tables. T-Recs work...

Pre-Printed and Hand-Filled Table-Form Analysis Aiming Cell Extraction

Luiz Antônio Pereira Neves, Rafaela Dandolini Felipe,

2008 The Eighth IAPR International Workshop on Document Analysis Systems

This paper presents an approach to extract the structure of pre-printed and hand-filled table-forms. The first module performs the cell identification based on Watershed transform. A second module det...

DeepTabStR: Deep Learning based Table Structure Recognition

Shoaib Ahmed Siddiqui, Andreas Dengel, Sheraz Ahmed et al.,

2019 International Conference on Document Analysis and Recognition (ICDAR)

This paper presents a novel method for the analysis of tabular structures in document images using the potential of deformable convolutional networks. In order to assess the suitability of the model t...

DeepDeSRT: Deep Learning for Detection and Structure Recognition of Tables in Document Images

Andreas Dengel, Sheraz Ahmed, Ivo Wolf et al.,

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)

This paper presents a novel end-to-end system for table understanding in document images called DeepDeSRT. In particular, the contribution of DeepDeSRT is two-fold. First, it presents a deep learning-...

Three approaches to "industrial" table spotting

Thomas Kieninger, Bertin Klein, Andreas Dengel et al.,

2001,

Proceedings of Sixth International Conference on Document Analysis and Recognition

This paper introduces three approaches for an industrial, comprehensive document analysis system to enable it to spot tables in documents. Searching for a set of known table headers (approach 1) works...

Image-based logical document structure recognition

Urszula Markowska-Kaczmar, Mariusz Paradowski, Michal Spytkowski et al.,

2014,

Pattern Analysis and Applications

The paper presents a complete solution for recognition of textual and graphic structures in various types of documents acquired from the Internet. In the proposed approach, the document structure reco...

The HiLeX System for Semantic Information Extraction

Mario Alviano, Nicola Leone, Marco Manna et al.,

2012,

Trans. Large Scale Data Knowl. Centered Syst.

The explosive growth and popularity of the Web has resulted in a huge amount of digital information sources on the Internet. Unfortunately, such sources only manage data, rather than the knowledge the...

Table Structure Extraction with Bi-Directional Gated Recurrent Unit Networks

Faisal Shafait, Muhammad Ali Shahzad, Saqib Ali Khan et al.,

2019 International Conference on Document Analysis and Recognition (ICDAR)

Tables present summarized and structured information to the reader, which makes table's structure extraction an important part of document understanding applications. However, table structure identifi...

Table Localization and Segmentation using GAN and CNN

Andreas Dengel, Syed Saqib Bukhari, Martin Jenckel et al.,

2019 International Conference on Document Analysis and Recognition Workshops (ICDARW)

Table localization and segmentation is an important but critical step in document image analysis. Table segmentation is much harder than table localization particularly in the invoice document because...

CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images

C. V. Jawahar, Ajoy Mondal, Madhav Agarwal et al.,

2020 25th International Conference on Pattern Recognition (ICPR)

Localizing page elements/objects such as tables, figures, equations, etc. is the primary step in extracting information from document images. We propose a novel end-to-end trainable deep network, (CDe...

Digital mountain: from granite archive to global access

William A. Barrett, Dallan Quass, Heath E. Nielson et al.,

First International Workshop on Document Image Analysis for Libraries, 2004. Proceedings.

Large-scale, multiterabyte digital libraries are becoming feasible due to decreasing costs of storage, CPU, and bandwidth. However, costs associated with preparing content for input into the library r...

Using Layout Data for the Analysis of Scientific Literature

Brigitte Mathiak, Silke Eckstein, Andreas Kupfer et al.,

2008,

Mining Complex Data

It is said that the world knowledge is in the Internet. Scientific knowledge is in the books, journals and conference proceedings. Yet both repositories are too large to skim through manually. We need...

FFD: Figure and Formula Detection from Document Images

Paul Lukowicz, Muhammad Imran Malik, Sheraz Ahmed et al.,

2019 Digital Image Computing: Techniques and Applications (DICTA)

In this work, we present a novel and generic approach, Figure and Formula Detector (FFD) to detect the formulas and figures from document images. Our proposed method employs traditional computer visio...

g-DICE: graph mining-based document information content exploitation

K. C. Santosh,

2015,

International Journal on Document Analysis and Recognition (IJDAR)

In this paper, we present document information content (i.e. text fields) extraction technique via graph mining. Real-world users first provide a set of key text fields from the document image which t...