Paper Citations

William A. Barrett, Heath E. Nielson,
Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.

Zoning documents increases the resolution of indexing from the image level to the field level. A line-delimited tabular document forms a well defined series of regions. However, as image quality decrease...

Within the scientific literature, tables are commonly used to present factual and statistical information in a compact way, which is easy to digest by readers. The ability to “understand” the structur...

Lovekesh Vig, Monika Sharma, Shubham Paliwal et al.,
2019 International Conference on Document Analysis and Recognition (ICDAR)

With the widespread use of mobile phones and scanners to photograph and upload documents, the need for extracting the information trapped in unstructured document images such as retail receipts, insur...

We present a novel methodology for extracting the structure of handwritten filled table-forms. The method identifies the table-form line intersections, detecting and correcting wrong intersections pro...

Zhoujun Li, Ming Zhou, Furu Wei et al.,
2019,
LREC

We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on the internet. Existing research for image-based table...

Translating document renderings (e.g. PDFs, scans) into hierarchical structures is extensively demanded in the daily routines of many real-world applications, and is often a prerequisite step of many ...

Thomas Kieninger, Andreas Dengel et al.,
2001,
Proceedings of Sixth International Conference on Document Analysis and Recognition

This paper summarizes the core idea of the T-Recs table recognition system, an integrated system covering block-segmentation, table location and a model-free structural analysis of tables. T-Recs work...

Luiz Antônio Pereira Neves, Rafaela Dandolini Felipe,
2008 The Eighth IAPR International Workshop on Document Analysis Systems

This paper presents an approach to extract the structure of pre-printed and hand-filled table-forms. The first module performs the cell identification based on Watershed transform. A second module det...

Shoaib Ahmed Siddiqui, Andreas Dengel, Sheraz Ahmed et al.,
2019 International Conference on Document Analysis and Recognition (ICDAR)

This paper presents a novel method for the analysis of tabular structures in document images using the potential of deformable convolutional networks. In order to assess the suitability of the model t...

Andreas Dengel, Sheraz Ahmed, Ivo Wolf et al.,
2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)

This paper presents a novel end-to-end system for table understanding in document images called DeepDeSRT. In particular, the contribution of DeepDeSRT is two-fold. First, it presents a deep learning-...

Thomas Kieninger, Bertin Klein, Andreas Dengel et al.,
2001,
Proceedings of Sixth International Conference on Document Analysis and Recognition

This paper introduces three approaches for an industrial, comprehensive document analysis system to enable it to spot tables in documents. Searching for a set of known table headers (approach 1) works...

Urszula Markowska-Kaczmar, Mariusz Paradowski, Michal Spytkowski et al.,
2014,
Pattern Analysis and Applications

The paper presents a complete solution for recognition of textual and graphic structures in various types of documents acquired from the Internet. In the proposed approach, the document structure reco...

Mario Alviano, Nicola Leone, Marco Manna et al.,
2012,
Trans. Large Scale Data Knowl. Centered Syst.

The explosive growth and popularity of the Web has resulted in a huge amount of digital information sources on the Internet. Unfortunately, such sources only manage data, rather than the knowledge the...

Faisal Shafait, Muhammad Ali Shahzad, Saqib Ali Khan et al.,
2019 International Conference on Document Analysis and Recognition (ICDAR)

Tables present summarized and structured information to the reader, which makes table's structure extraction an important part of document understanding applications. However, table structure identifi...

Andreas Dengel, Syed Saqib Bukhari, Martin Jenckel et al.,
2019 International Conference on Document Analysis and Recognition Workshops (ICDARW)

Table localization and segmentation is an important but critical step in document image analysis. Table segmentation is much harder than table localization particularly in the invoice document because...

C. V. Jawahar, Ajoy Mondal, Madhav Agarwal et al.,
2020 25th International Conference on Pattern Recognition (ICPR)

Localizing page elements/objects such as tables, figures, equations, etc. is the primary step in extracting information from document images. We propose a novel end-to-end trainable deep network, (CDe...

William A. Barrett, Dallan Quass, Heath E. Nielson et al.,
First International Workshop on Document Image Analysis for Libraries, 2004. Proceedings.

Large-scale, multiterabyte digital libraries are becoming feasible due to decreasing costs of storage, CPU, and bandwidth. However, costs associated with preparing content for input into the library r...

Brigitte Mathiak, Silke Eckstein, Andreas Kupfer et al.,
2008,
Mining Complex Data

It is said that the world knowledge is in the Internet. Scientific knowledge is in the books, journals and conference proceedings. Yet both repositories are too large to skim through manually. We need...

Paul Lukowicz, Muhammad Imran Malik, Sheraz Ahmed et al.,
2019 Digital Image Computing: Techniques and Applications (DICTA)

In this work, we present a novel and generic approach, Figure and Formula Detector (FFD) to detect the formulas and figures from document images. Our proposed method employs traditional computer visio...

K. C. Santosh,
2015,
International Journal on Document Analysis and Recognition (IJDAR)

In this paper, we present document information content (i.e. text fields) extraction technique via graph mining. Real-world users first provide a set of key text fields from the document image which t...