Paper information - DeCNT: Deep Deformable CNN for Table Detection

DeCNT: Deep Deformable CNN for Table Detection

This paper presents a novel approach to detecting tables in documents using deep neural networks. Conventional approaches to table detection rely on heuristics that are error-prone and specific to a dataset. In contrast, the presented approach learns from data to recognize tables of arbitrary layout. Most prior approaches to table detection apply only to PDFs, whereas the presented approach works directly on images, making it applicable to any format. The approach is based on a novel combination of deformable CNN with Faster R-CNN/FPN. A conventional CNN has a fixed receptive field, which is problematic for table detection because tables can appear at arbitrary scales and under arbitrary transformations (orientation). Deformable convolution conditions its receptive field on the input itself, allowing the field to mold to that input. This adaptation enables the network to handle tables of arbitrary layout. We evaluated the proposed approach on two major publicly available table detection datasets: ICDAR-2013 and ICDAR-2017 POD. It surpassed the state-of-the-art performance on both datasets, with F-measures of 0.994 and 0.968, respectively, indicating its effectiveness for the task of table detection.
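To make the idea of an input-dependent receptive field concrete, the following is a minimal sketch of a single-channel 3x3 deformable convolution in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names `bilinear` and `deform_conv3x3` are hypothetical, and the offsets are passed in by hand, whereas in a deformable CNN they would be predicted from the input by an additional convolutional layer.

```python
import math

def bilinear(img, y, x):
    """Sample img at fractional (y, x); points outside the image read as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * img[yy][xx]
    return value

def deform_conv3x3(img, kernel, offsets):
    """3x3 deformable convolution with stride 1 and zero padding 1.

    offsets[i][j] holds nine (dy, dx) pairs, one per kernel tap, for
    output position (i, j). Assumption: offsets are given, not learned.
    """
    h, w = len(img), len(img[0])
    taps = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k, (ky, kx) in enumerate(taps):
                dy, dx = offsets[i][j][k]
                # Each tap samples the regular grid position plus its offset.
                acc += kernel[ky + 1][kx + 1] * bilinear(img, i + ky + dy, j + kx + dx)
            out[i][j] = acc
    return out

# Toy 4x4 input and a 3x3 kernel of ones (illustrative values only).
img = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]

# Zero offsets reduce to an ordinary fixed-grid 3x3 convolution.
zero = [[[(0.0, 0.0)] * 9 for _ in range(4)] for _ in range(4)]
# Shifting every tap half a pixel to the right moves the receptive field.
shifted = [[[(0.0, 0.5)] * 9 for _ in range(4)] for _ in range(4)]

regular = deform_conv3x3(img, kernel, zero)
deformed = deform_conv3x3(img, kernel, shifted)
```

With all offsets set to zero the sampling grid is the fixed 3x3 neighbourhood of a conventional convolution; non-zero offsets move each tap independently, which is what lets the receptive field adapt to the scale and orientation of a table.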
