A New Scheme for Citation Classification based on Convolutional Neural Networks

Automated classification of citation function in scientific text is an emerging research topic inspired by traditional citation analysis in applied linguistics and scientometrics. The aim is to classify citations in scholarly publications in order to identify the author’s purpose or motivation for citing a particular paper. Several citation schemes have been proposed to classify citations into different functions. However, it is extremely challenging to find a standard scheme for classifying citations, and some of the proposed schemes contain similar functions. Moreover, most previous studies used classical machine learning methods, such as support vector machines and neural networks, with a number of manually created features. These features are incomplete, and creating them is time-consuming and error-prone. To address these problems, we present a new citation scheme with fewer functions and propose a deep learning model for classification. The citation sentences and author’s information were fed to convolutional neural networks to build citation and author representations. A corpus was built using the proposed scheme, and a number of experiments were carried out to assess the model. Experimental results show that the proposed approach outperforms existing methods in terms of accuracy, precision and recall.
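The two-branch idea described above, in which one convolutional encoder handles the citation sentence and another handles the author information, can be illustrated with a minimal sketch. This is not the model evaluated in this work: the embedding size, filter width, number of filters, number of classes, the random weights, and the treatment of author information as a short sequence of vectors are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
import random

def conv_max_pool(embeddings, filters, width):
    """Slide each filter over windows of `width` consecutive token vectors,
    apply ReLU, and keep the largest activation per filter
    (max-over-time pooling)."""
    features = []
    for filt in filters:
        best = 0.0  # ReLU floor: negative activations are clipped to 0
        for start in range(len(embeddings) - width + 1):
            # Flatten the window of token vectors into one vector.
            window = [x for vec in embeddings[start:start + width] for x in vec]
            activation = sum(w * x for w, x in zip(filt, window))
            best = max(best, activation)
        features.append(best)
    return features

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(citation_emb, author_emb, params):
    """Encode both inputs, concatenate the two representations,
    and score each (assumed) citation function."""
    cit_repr = conv_max_pool(citation_emb, params["cit_filters"], params["width"])
    auth_repr = conv_max_pool(author_emb, params["auth_filters"], params["width"])
    joint = cit_repr + auth_repr  # list concatenation
    scores = [sum(w * x for w, x in zip(row, joint)) for row in params["out_weights"]]
    return softmax(scores)

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Toy sizes, chosen only for illustration (assumptions, not values from this work).
DIM, WIDTH, N_FILTERS, N_CLASSES = 4, 2, 3, 3
rng = random.Random(0)
params = {
    "width": WIDTH,
    "cit_filters": random_matrix(N_FILTERS, WIDTH * DIM, rng),
    "auth_filters": random_matrix(N_FILTERS, WIDTH * DIM, rng),
    "out_weights": random_matrix(N_CLASSES, 2 * N_FILTERS, rng),
}
citation_emb = random_matrix(6, DIM, rng)  # stand-in for 6 embedded tokens
author_emb = random_matrix(3, DIM, rng)    # stand-in for 3 embedded author fields
probs = classify(citation_emb, author_emb, params)
```

With untrained random weights the output probabilities carry no meaning; the sketch only shows how the two representations are built and combined before classification.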
