FFD: Figure and Formula Detection from Document Images

In this work, we present a novel and generic approach, Figure and Formula Detector (FFD), to detect formulas and figures in document images. The proposed method combines traditional computer vision techniques with deep models. We transform each input image by applying connected component (CC) analysis, a distance transform, and a colour transform, and stack the three results to form the input image for the network. The best results achieved by FFD are F1-scores of 0.906 for figure detection and 0.905 for formula detection. We also propose a new dataset for figure and formula detection to aid future research in this direction. The results suggest that enhancing the input representation can simplify the subsequent optimization problem, yielding significant gains over conventional approaches that feed the unmodified image to the network.
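The input transformation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_network_input`, the binarization threshold, the area-based encoding of connected components, and the use of inverted intensity as a stand-in for the colour transform are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np
from scipy import ndimage


def build_network_input(gray: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Stack three hand-crafted views of a grayscale page into an H x W x 3 image.

    Hypothetical helper: the channel encodings below are illustrative choices,
    not the ones used by FFD.
    """
    # Foreground mask: dark pixels (ink) are True. The threshold is an assumption.
    ink = gray < threshold

    # Channel 1: connected components, each pixel encoded by the log-scaled
    # area of the component it belongs to (background stays 0).
    labels, _ = ndimage.label(ink)
    areas = np.bincount(labels.ravel())
    areas[0] = 0  # label 0 is the background
    cc = np.log1p(areas[labels].astype(np.float64))

    # Channel 2: Euclidean distance from every background pixel to the
    # nearest ink pixel (ink pixels themselves get distance 0).
    dist = ndimage.distance_transform_edt(~ink)

    # Channel 3: placeholder for the colour transform, here inverted intensity.
    colour = 255.0 - gray.astype(np.float64)

    def to_uint8(channel: np.ndarray) -> np.ndarray:
        # Rescale a non-negative channel to the 0..255 range.
        peak = channel.max()
        if peak == 0:
            return np.zeros(channel.shape, dtype=np.uint8)
        return (255.0 * channel / peak).astype(np.uint8)

    return np.dstack([to_uint8(cc), to_uint8(dist), to_uint8(colour)])


# Usage: a synthetic white page with two dark blobs of different sizes.
page = np.full((32, 32), 255, dtype=np.uint8)
page[4:8, 4:20] = 0     # large component
page[20:22, 10:12] = 0  # small component
stacked = build_network_input(page)
print(stacked.shape, stacked.dtype)
```

The resulting three-channel array has the same layout as an RGB image, so under these assumptions it could be passed to a standard detection network without changing the network's input layer.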
