DeepBiRD: An Automatic Bibliographic Reference Detection Approach

The contribution of this paper is twofold. First, it presents DeepBiRD, a novel approach inspired by human visual perception that exploits layout features to identify individual references in a scientific publication. Second, it presents a new dataset for image-based reference detection comprising 2401 scans containing 12244 references, each manually annotated at the level of individual references. The proposed approach consists of two stages. The first stage identifies whether a given document image is single-column or multi-column and, using this information, splits the image into individual columns. The second stage performs layout-driven reference detection on the publication using Mask R-CNN. DeepBiRD was evaluated on two different datasets to demonstrate the generalization of the approach. The proposed system achieved an F-measure of 0.96 on our dataset, and it detected 2.5 times more references than the current state-of-the-art approach on that approach's own dataset. These results suggest that DeepBiRD is significantly superior in performance, generalizable, and independent of any domain or referencing style.
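To make the first stage concrete, the following is a minimal sketch of deciding whether a page is single-column or multi-column and splitting it accordingly. It is not the method used in DeepBiRD, which the abstract does not specify; it substitutes a simple projection-profile heuristic that looks for a blank vertical gutter near the middle of a binarized page. The names `find_gutter` and `split_columns`, the `min_gutter` parameter, and the choice to search the central third of the page are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Assumed input: a binarized page as rows of pixels, 1 = ink, 0 = background.
Page = List[List[int]]


def find_gutter(page: Page, min_gutter: int = 3) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the widest blank column run in the central
    third of the page, or None if no run is at least min_gutter wide."""
    width = len(page[0])
    # Vertical projection profile: amount of ink in each pixel column.
    ink = [sum(row[x] for row in page) for x in range(width)]
    lo, hi = width // 3, 2 * width // 3

    best = None
    run_start = None
    # Iterate one step past hi so a run touching the boundary is closed.
    for x in range(lo, hi + 1):
        blank = x < hi and ink[x] == 0
        if blank and run_start is None:
            run_start = x
        elif not blank and run_start is not None:
            if best is None or x - run_start > best[1] - best[0]:
                best = (run_start, x)
            run_start = None

    if best is not None and best[1] - best[0] >= min_gutter:
        return best
    return None


def split_columns(page: Page, min_gutter: int = 3) -> List[Page]:
    """Return [page] for a single-column layout, or [left, right] when a
    central gutter is found (hypothetical heuristic, two columns at most)."""
    gutter = find_gutter(page, min_gutter)
    if gutter is None:
        return [page]
    mid = (gutter[0] + gutter[1]) // 2
    left = [row[:mid] for row in page]
    right = [row[mid:] for row in page]
    return [left, right]
```

Under these assumptions, each returned column image would then be passed separately to the second-stage detector (Mask R-CNN in the paper), so that a reference in the left column is never merged with text at the same height in the right column.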
