Paper Information - Table Structure Recognition using Top-Down and Bottom-Up Cues

Tables are information-rich structured objects in document images. While significant work has been done on localizing tables as graphic objects in document images, only limited attempts exist at table structure recognition. Most existing literature on structure recognition depends either on meta-features extracted from the PDF document or on optical character recognition (OCR) models that extract low-level layout features from the image. However, these methods fail to generalize well, because meta-features may be absent and OCR makes errors when table layouts and text organization vary significantly. In our work, we focus on tables that have complex structures, dense content, and varying layouts, with no dependency on meta-features and/or OCR. We present an approach for table structure recognition that combines cell detection and interaction modules to localize the cells and predict their row and column associations with other detected cells. We incorporate structural constraints as additional differential components to the loss function for cell detection. We empirically validate our method on the following publicly available real-world datasets: ICDAR-2013, ICDAR-2019 (cTDaR) archival, UNLV, SciTSR, SciTSR-COMP, TableBank, and PubTabNet. Our attempt opens up a new direction for table structure recognition by combining top-down (table cell detection) and bottom-up (structure recognition) cues in visually understanding tables.
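To make the phrase "row and column associations" concrete, the sketch below shows one way such associations could be represented once cell boxes are available. It is an illustration only and not the method of the paper: the abstract describes a learned interaction module, whereas this sketch substitutes a simple geometric overlap heuristic. The box format `(x1, y1, x2, y2)`, the function names, and the `thresh` parameter are all assumptions introduced here.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in image coordinates.
Box = Tuple[float, float, float, float]


def overlap_1d(a1: float, a2: float, b1: float, b2: float) -> float:
    """Fraction of the shorter interval covered by the intersection."""
    inter = min(a2, b2) - max(a1, b1)
    shorter = min(a2 - a1, b2 - b1)
    return max(inter, 0.0) / shorter if shorter > 0 else 0.0


def associations(
    boxes: List[Box], thresh: float = 0.5
) -> Tuple[List[List[bool]], List[List[bool]]]:
    """Return (same_row, same_col) adjacency matrices over detected cells.

    Stand-in heuristic (an assumption, not the paper's learned module):
    two cells share a row if their vertical extents overlap enough, and
    share a column if their horizontal extents overlap enough.
    """
    n = len(boxes)
    same_row = [[False] * n for _ in range(n)]
    same_col = [[False] * n for _ in range(n)]
    for i, (ax1, ay1, ax2, ay2) in enumerate(boxes):
        for j, (bx1, by1, bx2, by2) in enumerate(boxes):
            if i == j:
                continue
            same_row[i][j] = overlap_1d(ay1, ay2, by1, by2) >= thresh
            same_col[i][j] = overlap_1d(ax1, ax2, bx1, bx2) >= thresh
    return same_row, same_col


# Four cells laid out as a 2x2 grid.
cells = [(0, 0, 10, 5), (12, 0, 22, 5), (0, 6, 10, 11), (12, 6, 22, 11)]
rows, cols = associations(cells)
```

Here `rows[i][j]` and `cols[i][j]` form the pairwise structure that a table structure recognizer ultimately has to produce. A learned model would replace the overlap test with a predicted probability for each pair of cells.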
