Holistic design for deep learning-based discovery of tabular structures in datasheet images

Abstract: Extracting data from tabular structures contained within product datasheets is crucial in many contexts, particularly in the management and optimization of supply chains that serve various industries. To minimize human intervention, table detection and table structure detection form the essential functionality. However, a self-contained holistic solution that extracts tables together with their columns and rows is not readily available. To address this challenge, this study presents a new formal procedure that consists of the following sequence: table detection, structure segmentation, and holistic tabular structure detection on documents. The proposed table detection model outperforms state-of-the-art solutions, achieving a recall of 1.0 and a precision of more than 0.99 on public competition datasets. Furthermore, this work introduces a judging mechanism and an agreement-based post-processing procedure to incorporate hand-crafted rules into the deep learning models. Although the individual components achieve a new state-of-the-art F1-score, the best F-measure achieved by the integrated holistic system is 0.89.
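To make the reported figures easier to relate to one another, the sketch below computes the F-measure as the harmonic mean of precision and recall, which is the standard F1 definition. The abstract does not state the evaluation protocol (for example, how a detection is matched to a ground-truth table), so the function name `f_measure` and the sample inputs are illustrative assumptions, not the authors' evaluation code. The precision value 0.99 is only the lower bound quoted in the abstract.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Table detection figures quoted in the abstract:
# recall of 1.0 and precision of at least 0.99 (lower bound).
detection_f1 = f_measure(precision=0.99, recall=1.0)
print(f"Table detection F1 (lower bound): {detection_f1:.4f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, an integrated F-measure of 0.89 implies that at least one of the holistic system's precision or recall is at or below 0.89, even though the detection stage alone scores close to 1.0.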
