A Preliminary Study to Compare Deep Learning with Rule-based Approaches for Citation Classification

Categorizing the semantic relationships between scientific papers is key to characterizing the state of a research field and to identifying influential works. Recently, approaches based on deep learning have performed well on Natural Language Processing problems such as text classification and information extraction. In this paper, we show how deep learning algorithms can automatically learn to classify citations, and that they could provide a relevant alternative to recent state-of-the-art methods based on pattern extraction. We also discuss how appropriate these algorithms are, given that neural networks require large datasets for training.
